{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Logistic Regression with scikit-learn\n",
    "\n",
    "This is an example of logistic regression in Python with the [scikit-learn module](http://scikit-learn.org/), performed for an [assignment](https://github.com/ajschumacher/gadsdc/tree/master/logistic_assignment) with my [General Assembly Data Science class](https://generalassemb.ly/education/data-science).\n",
    "\n",
    "## Dataset\n",
    "\n",
    "The dataset I chose is the [affairs dataset](http://statsmodels.sourceforge.net/stable/datasets/generated/fair.html) that comes with [Statsmodels](http://statsmodels.sourceforge.net/). It was derived from a survey of women in 1974 by Redbook magazine, in which married women were asked about their participation in extramarital affairs. More information about the study is available in a [1978 paper](http://fairmodel.econ.yale.edu/rayfair/pdf/1978a200.pdf) from the Journal of Political Economy.\n",
    "\n",
    "## Description of Variables\n",
    "\n",
    "The dataset contains 6366 observations of 9 variables:\n",
    "\n",
    "* `rate_marriage`: woman's rating of her marriage (1 = very poor, 5 = very good)\n",
    "* `age`: woman's age\n",
    "* `yrs_married`: number of years married\n",
    "* `children`: number of children\n",
    "* `religious`: woman's rating of how religious she is (1 = not religious, 4 = strongly religious)\n",
    "* `educ`: level of education (9 = grade school, 12 = high school, 14 = some college, 16 = college graduate, 17 = some graduate school, 20 = advanced degree)\n",
    "* `occupation`: woman's occupation (1 = student, 2 = farming/semi-skilled/unskilled, 3 = \"white collar\", 4 = teacher/nurse/writer/technician/skilled, 5 = managerial/business, 6 = professional with advanced degree)\n",
    "* `occupation_husb`: husband's occupation (same coding as above)\n",
    "* `affairs`: time spent in extra-marital affairs\n",
    "\n",
    "## Problem Statement\n",
    "\n",
    "I decided to treat this as a classification problem by creating a new binary variable `affair` (did the woman have at least one affair?) and trying to predict the classification for each woman.\n",
    "\n",
    "Skipper Seabold, one of the primary contributors to Statsmodels, did a similar classification in his [Statsmodels demo](https://github.com/jseabold/pydc) at a [Statistical Programming DC Meetup](http://www.meetup.com/stats-prog-dc/events/173693192/). However, he used Statsmodels for the classification (whereas I'm using scikit-learn), and he treated the occupation variables as continuous (whereas I'm treating them as categorical)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import modules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import statsmodels.api as sm\n",
    "import matplotlib.pyplot as plt\n",
    "from patsy import dmatrices\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn import metrics\n",
    "from sklearn.cross_validation import cross_val_score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Pre-Processing\n",
    "\n",
    "First, let's load the dataset and add a binary `affair` column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# load dataset\n",
    "dta = sm.datasets.fair.load_pandas().data\n",
    "\n",
    "# add \"affair\" column: 1 represents having affairs, 0 represents not\n",
    "dta['affair'] = (dta.affairs > 0).astype(int)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Exploration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>rate_marriage</th>\n",
       "      <th>age</th>\n",
       "      <th>yrs_married</th>\n",
       "      <th>children</th>\n",
       "      <th>religious</th>\n",
       "      <th>educ</th>\n",
       "      <th>occupation</th>\n",
       "      <th>occupation_husb</th>\n",
       "      <th>affairs</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>affair</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>4.329701</td>\n",
       "      <td>28.390679</td>\n",
       "      <td>7.989335</td>\n",
       "      <td>1.238813</td>\n",
       "      <td>2.504521</td>\n",
       "      <td>14.322977</td>\n",
       "      <td>3.405286</td>\n",
       "      <td>3.833758</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3.647345</td>\n",
       "      <td>30.537019</td>\n",
       "      <td>11.152460</td>\n",
       "      <td>1.728933</td>\n",
       "      <td>2.261568</td>\n",
       "      <td>13.972236</td>\n",
       "      <td>3.463712</td>\n",
       "      <td>3.884559</td>\n",
       "      <td>2.187243</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        rate_marriage        age  yrs_married  children  religious       educ  \\\n",
       "affair                                                                          \n",
       "0            4.329701  28.390679     7.989335  1.238813   2.504521  14.322977   \n",
       "1            3.647345  30.537019    11.152460  1.728933   2.261568  13.972236   \n",
       "\n",
       "        occupation  occupation_husb   affairs  \n",
       "affair                                         \n",
       "0         3.405286         3.833758  0.000000  \n",
       "1         3.463712         3.884559  2.187243  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dta.groupby('affair').mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that on average, women who have affairs rate their marriages lower, which is to be expected. Let's take another look at the `rate_marriage` variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>yrs_married</th>\n",
       "      <th>children</th>\n",
       "      <th>religious</th>\n",
       "      <th>educ</th>\n",
       "      <th>occupation</th>\n",
       "      <th>occupation_husb</th>\n",
       "      <th>affairs</th>\n",
       "      <th>affair</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>rate_marriage</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1.0</th>\n",
       "      <td>33.823232</td>\n",
       "      <td>13.914141</td>\n",
       "      <td>2.308081</td>\n",
       "      <td>2.343434</td>\n",
       "      <td>13.848485</td>\n",
       "      <td>3.232323</td>\n",
       "      <td>3.838384</td>\n",
       "      <td>1.201671</td>\n",
       "      <td>0.747475</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2.0</th>\n",
       "      <td>30.471264</td>\n",
       "      <td>10.727011</td>\n",
       "      <td>1.735632</td>\n",
       "      <td>2.330460</td>\n",
       "      <td>13.864943</td>\n",
       "      <td>3.327586</td>\n",
       "      <td>3.764368</td>\n",
       "      <td>1.615745</td>\n",
       "      <td>0.635057</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3.0</th>\n",
       "      <td>30.008056</td>\n",
       "      <td>10.239174</td>\n",
       "      <td>1.638469</td>\n",
       "      <td>2.308157</td>\n",
       "      <td>14.001007</td>\n",
       "      <td>3.402820</td>\n",
       "      <td>3.798590</td>\n",
       "      <td>1.371281</td>\n",
       "      <td>0.550856</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4.0</th>\n",
       "      <td>28.856601</td>\n",
       "      <td>8.816905</td>\n",
       "      <td>1.369536</td>\n",
       "      <td>2.400981</td>\n",
       "      <td>14.144514</td>\n",
       "      <td>3.420161</td>\n",
       "      <td>3.835861</td>\n",
       "      <td>0.674837</td>\n",
       "      <td>0.322926</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5.0</th>\n",
       "      <td>28.574702</td>\n",
       "      <td>8.311662</td>\n",
       "      <td>1.252794</td>\n",
       "      <td>2.506334</td>\n",
       "      <td>14.399776</td>\n",
       "      <td>3.454918</td>\n",
       "      <td>3.892697</td>\n",
       "      <td>0.348174</td>\n",
       "      <td>0.181446</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     age  yrs_married  children  religious       educ  \\\n",
       "rate_marriage                                                           \n",
       "1.0            33.823232    13.914141  2.308081   2.343434  13.848485   \n",
       "2.0            30.471264    10.727011  1.735632   2.330460  13.864943   \n",
       "3.0            30.008056    10.239174  1.638469   2.308157  14.001007   \n",
       "4.0            28.856601     8.816905  1.369536   2.400981  14.144514   \n",
       "5.0            28.574702     8.311662  1.252794   2.506334  14.399776   \n",
       "\n",
       "               occupation  occupation_husb   affairs    affair  \n",
       "rate_marriage                                                   \n",
       "1.0              3.232323         3.838384  1.201671  0.747475  \n",
       "2.0              3.327586         3.764368  1.615745  0.635057  \n",
       "3.0              3.402820         3.798590  1.371281  0.550856  \n",
       "4.0              3.420161         3.835861  0.674837  0.322926  \n",
       "5.0              3.454918         3.892697  0.348174  0.181446  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dta.groupby('rate_marriage').mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An increase in `age`, `yrs_married`, and `children` appears to correlate with a declining marriage rating."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# show plots in the notebook\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's start with histograms of education and marriage rating."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0xbeef3c8>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAEWCAYAAACe8xtsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAG6tJREFUeJzt3X+UXWV97/H3h/ArECHE6IiQGqRBG4ggibnUSp2IV6KA\nwa6KoVGCpcQKcsUbK4HrLfT2xpVagYJc0FhY4VcJARSCkSpQB2o1DQmCIfxoIglCiEkVQhiggYTv\n/WM/I4dhzsx5MmfPPjP5vNY66+z97H32/j5nZs5n9o+ztyICMzOzHLtUXYCZmQ0+Dg8zM8vm8DAz\ns2wODzMzy+bwMDOzbA4PMzPL5vCwSklaJam96jqqJOkTkp6U1Cnpvf1cVrukp5pVW8Z6z5P0jwO9\nXquOw8NKI2mdpA93aztV0k+6xiPi0Ijo6GM5YyWFpF1LKrVq3wC+EBEjIuLn3Semvr+QwqXr8ZUK\n6uyq5w0BFRFfi4i/qKomG3hD9Y/RrGGSdo2IbRWW8A5gVR/zHB4RawaiGLNGeMvDKlW7dSJpsqTl\nkrZI2ijpojTbvel5c/qv+w8l7SLpq5KekLRJ0jWS9q1Z7ilp2m8l/e9u67lA0s2SrpO0BTg1rftn\nkjZL2iDpMkm71ywvJJ0habWk5yX9raSDJf001buodv5ufeyxVkl7SOoEhgEPSvrlDrx/wyUtkPSs\npIeB93WbHpJ+v2Z8gaT/WzM+TdIDqQ+/lDQ1tX9W0iOpr49L+lxq3xu4A3h7zVbQ29N7el3Ncj+e\ndklultQh6Q9qpq2T9GVJv5D0nKQbJe2Z23erlsPDWsklwCURsQ9wMLAotf9xeh6Zdu38DDg1PaYA\n7wRGAJcBSBoPXA7MAPYH9gUO6LauacDNwEjgemA78CVgNPCHwDHAGd1ecywwETgK+AowH/g0MAY4\nDDi5Tr96rDUitkbEiDTP4RFxcP23pq7zKd6rg1N9Mxt9oaTJwDXAX1G8D38MrEuTNwHHA/sAnwUu\nlnRkRLwAfBR4Ov0sRkTE092WewhwA3A28BbgB8Dt3cL1JGAqcBDwHor3xwYRh4eV7db03+dmSZsp\nPtTreQX4fUmjI6IzIpb2Mu8M4KKIeDwiOoFzgenpuMifArdHxE8i4mXgr4HuF3H7WUTcGhGvRsRL\nEbEiIpZGxLaIWAd8G/hgt9d8PSK2RMQq4CHgR2n9z1H8N17vYHdvtTbq/tr3UdKxqf0kYG5EPBMR\nTwKXZizzNOCqiLgzvQ/rI+JRgIhYEhG/jMI9wI+Aoxtc7qeAJWm5r1Ac0xkOvL9mnksj4umIeAa4\nHTgio25rAQ4PK9uJETGy68Eb/5uvdRpwCPCopPskHd/LvG8HnqgZf4LiGF5bmvZk14SIeBH4bbfX\nP1k7IukQSd+X9Ou0K+trFFshtTbWDL/Uw/gIetZbrY06svZ9jIgf1iy7ti9P9PDaesYAPe4qk/RR\nSUslPZNC/2O88f2o53X9jYhXU421W3+/rhl+kfrvnbUoh4e1jIhYHREnA28F/g64Oe1j7+nSz09T\nHGju8nvANooP9A3AgV0TJA0H3tx9dd3GrwAeBcal3WbnAdrx3jRca39toAiB2mXXehHYq2b8bTXD\nT1Ls7nodSXsAt1BsMbSl0P8Br70ffV2K+3X9laRU4/o+XmeDiMPDWoakT0t6S/pPdXNqfhX4z/T8\nzprZbwC+JOkgSSMothRuTGdN3QycIOn9aT/7BfQdBG8CtgCdkt4NfL5Z/eqj1v5aBJwraT9JBwJn\ndZv+APBnkoalg+G1u+KuBD4r6Zh0UP+A1PfdgT0o3vdtkj4KfKTmdRuBN9eeoNBDTcel5e4GzAa2\nAj/tZ1+thTg8rJVMBValM5AuAaan4xEvAnOBf0v7+48CrgKupTgTay3wX6QPznRM4ixgIcV/5p0U\nB4C39rLuLwN/BjwPfAe4sYn9qltrhgf1+u95/ENq/xuKXURrKY5LXNvtdV8ETqAI4xnArV0TImIZ\n6WA48BxwD/COiHge+B8UIfAsxfuyuOZ1j1IE4uPp5/H22hVGxGMUJxJ8E/hNWv8J6fiTDRHyzaBs\nqEv/7W+m2CW1tup6zIYCb3nYkCTpBEl7pWMm3wBW8tppqGbWTw4PG6qmURy4fRoYR7ELzJvZZk3i\n3VZmZpbNWx5mZpZtyF4YcfTo0TF27Niqy2jICy+8wN577111GaVw3wavodw/962+FStW/CYi3tLX\nfEM2PMaOHcvy5curLqMhHR0dtLe3V11GKdy3wWso9899q09SQ1cp8G4rMzPL5vAwM7NsDg8zM8vm\n8DAzs2wODzMzy+bwMDOzbA4PMzPL5vAwM7NsDg8zM8s2ZL9hbtaXsXOW9Ov1syds49QdWMa6ecf1\na71mrcBbHmZmls3hYWZm2RweZmaWzeFhZmbZHB5mZpbN4WFmZtkcHmZmls3hYWZm2RweZmaWzeFh\nZmbZHB5mZpbN4WFmZtkcHmZmls3hYWZm2RweZmaWzeFhZmbZfDMoA/p/Y6R6+rphkm+MZDY4ecvD\nzMyyOTzMzCybw8PMzLI5PMzMLFtp4SFpjKQfS3pY0ipJX0ztoyTdKWl1et6v5jXnSloj6TFJx9a0\nT5S0Mk27VJLKqtvMzPpW5pbHNmB2RIwHjgLOlDQemAPcHRHjgLvTOGnadOBQYCpwuaRhaVlXAKcD\n49Jjaol1m5lZH0oLj4jYEBH3p+HngUeAA4BpwNVptquBE9PwNGBhRGyNiLXAGmCypP2BfSJiaUQE\ncE3Na8zMrAIqPo9LXok0FrgXOAz4VUSMTO0Cno2IkZIuA5ZGxHVp2pXAHcA6YF5EfDi1Hw2cExHH\n97CeWcAsgLa2tokLFy4suWfN0dnZyYgRIyqtYeX650pZbttw2PhS/ekTDti3lPU2or997qtv9VTZ\n5xyt8HtZFvetvilTpqyIiEl9zVf6lwQljQBuAc6OiC21hysiIiQ1Lb0iYj4wH2DSpEnR3t7erEWX\nqqOjg6pr7e2LfP0xe8I2LlxZ/9ds3Yz2UtbbiP72ua++1VNln3O0wu9lWdy3/iv1bCtJu1EEx/UR\n8d3UvDHtiiI9b0rt64ExNS8/MLWtT8Pd283MrCJlnm0l4ErgkYi4qGbSYmBmGp4J3FbTPl3SHpIO\nojgwviwiNgBbJB2VlnlKzWvMzKwCZe62+iPgM8BKSQ+ktvOAecAiSacBTwAnAUTEKkmLgIcpztQ6\nMyK2p9edASwAhlMcB7mjxLrNzKwPpYVHRPwEqPd9jGPqvGYuMLeH9uUUB9vNzKwF+BvmZmaWzeFh\nZmbZHB5mZpbN4WFmZtkcHmZmls3hYWZm2RweZmaWzeFhZmbZHB5mZpbN4WFmZtkcHmZmls3hYWZm\n2RweZmaWzeFhZmbZHB5mZpbN4WFmZtkcHmZmls3hYWZm2RweZmaWzeFhZmbZHB5mZpbN4WFmZtkc\nHmZmls3hYWZm2RweZmaWzeFhZmbZHB5mZpbN4WFmZtkcHmZmls3hYWZm2RweZmaWzeFhZmbZHB5m\nZpbN4WFmZtkcHmZmls3hYWZm2UoLD0lXSdok6aGatgskrZf0QHp8rGbauZLWSHpM0rE17RMlrUzT\nLpWksmo2M7PGlLnlsQCY2kP7xRFxRHr8AEDSeGA6cGh6zeWShqX5rwBOB8alR0/LNDOzAVRaeETE\nvcAzDc4+DVgYEVsjYi2wBpgsaX9gn4hYGhEBXAOcWE7FZmbWqF0rWOdZkk4BlgOzI+JZ4ABgac08\nT6W2V9Jw9/YeSZoFzAJoa2ujo6OjuZWXpLOzs/JaZ0/YVspy24b3vuwq+93fPvfVt3qq/lk3qhV+\nL8vivvXfQIfHFcDfApGeLwT+vFkLj4j5wHyASZMmRXt7e7MWXaqOjg6qrvXUOUtKWe7sCdu4cGX9\nX7N1M9pLWW8j+tvnvvpWT5V9ztEKv5dlcd/6b0DPtoqIjRGxPSJeBb4DTE6T1gNjamY9MLWtT8Pd\n283MrEIDGh7pGEaXTwBdZ2ItBqZL2kPSQRQHxpdFxAZgi6Sj0llWpwC3DWTNZmb2Rg1tc0uaEBEr\ncxYs6QagHRgt6SngfKBd0hEUu63WAZ8DiIhVkhYBDwPbgDMjYnta1BkUZ24NB+5IDzMzq1CjO2wv\nl7QHxYf49RHxXF8viIiTe2i+spf55wJze2hfDhzWYJ1mZjYAGtptFRFHAzMojkuskPRPkv57qZWZ\nmVnLaviYR0SsBr4KnAN8ELhU0qOS/qSs4szMrDU1FB6S3iPpYuAR4EPACRHxB2n44hLrMzOzFtTo\nMY9vAv8InBcRL3U1RsTTkr5aSmVmZtayGg2P44CXus6AkrQLsGdEvBgR15ZWnZmZtaRGj3ncRXGq\nbJe9UpuZme2EGg2PPSOis2skDe9VTklmZtbqGg2PFyQd2TUiaSLwUi/zm5nZENboMY+zgZskPQ0I\neBvwqdKqMjOzltZQeETEfZLeDbwrNT0WEa+UV5aZmbWynOtJvw8Ym15zpCQi4ppSqjIzs5bW6IUR\nrwUOBh4Aui5Y2HVnPzMz28k0uuUxCRifbgVrZmY7uUbPtnqI4iC5mZlZw1seo4GHJS0DtnY1RsTH\nS6nKzMxaWqPhcUGZRZiZ2eDS6Km690h6BzAuIu6StBcwrNzSzMysVTV6SfbTgZuBb6emA4BbyyrK\nzMxaW6MHzM8E/gjYAr+7MdRbyyrKzMxaW6PhsTUiXu4akbQrxfc8zMxsJ9RoeNwj6TxgeLp3+U3A\n7eWVZWZmrazR8JgD/CewEvgc8AOK+5mbmdlOqNGzrV4FvpMeZma2k2v02lZr6eEYR0S8s+kVmZlZ\ny8u5tlWXPYFPAqOaX46ZmQ0GDR3ziIjf1jzWR8Q/AMeVXJuZmbWoRndbHVkzugvFlkjOvUDMzGwI\naTQALqwZ3gasA05qejVmZjYoNHq21ZSyCzEzs8Gj0d1W/7O36RFxUXPKMbMyjZ2zpOF5Z0/YxqkZ\n8/dm3TwfIh1qcs62eh+wOI2fACwDVpdRlJmZtbZGw+NA4MiIeB5A0gXAkoj4dFmFmZlZ62r08iRt\nwMs14y+nNjMz2wk1uuVxDbBM0vfS+InA1eWUZGZmra7Rs63mSroDODo1fTYifl5eWWZm1soa3W0F\nsBewJSIuAZ6SdFBJNZmZWYtr9Da05wPnAOempt2A68oqyszMWlujWx6fAD4OvAAQEU8Db+rtBZKu\nkrRJ0kM1baMk3SlpdXrer2bauZLWSHpM0rE17RMlrUzTLpWknA6amVnzNRoeL0dEkC7LLmnvBl6z\nAJjarW0OcHdEjAPuTuNIGg9MBw5Nr7lc0rD0miuA04Fx6dF9mWZmNsAaDY9Fkr4NjJR0OnAXfdwY\nKiLuBZ7p1jyN187SuprirK2u9oURsTUi1gJrgMmS9gf2iYilKbyuqXmNmZlVRMVncgMzFvcu/wgg\n4IcRcWcDrxkLfD8iDkvjmyNiZBoW8GxEjJR0GbA0Iq5L064E7qC4AOO8iPhwaj8aOCcijq+zvlnA\nLIC2traJCxcubKhvVevs7GTEiBGV1rBy/XOlLLdtOGx8qf70CQfsW8p6G9HfPvfVt3oGS593tH89\nqbLPPWmFv7my9LdvU6ZMWRERk/qar89TddPuo7vSxRH7DIxGRURIaiy5Gl/mfGA+wKRJk6K9vb2Z\niy9NR0cHVdfarGsYdTd7wjYuXFn/12zdjPZS1tuI/va5r77VM1j6vKP960mVfe5JK/zNlWWg+tbn\nbquI2A68KqkZ/zpsTLuiSM+bUvt6YEzNfAemtvVpuHu7mZlVqNFjHp3ASklXpjOeLpV06Q6sbzEw\nMw3PBG6raZ8uaY/0/ZFxwLKI2ABskXRU2s11Ss1rzMysIo1uk343PRom6QagHRgt6SngfGAexcH3\n04AnSDeUiohVkhYBD1PcbOrMtMUDcAbFmVvDKY6D3JFTh5mZNV+v4SHp9yLiVxGRfR2riDi5zqRj\n6sw/F5jbQ/ty4LDc9ZuZWXn62m11a9eApFtKrsXMzAaJvsKj9tvc7yyzEDMzGzz6Co+oM2xmZjux\nvg6YHy5pC8UWyPA0TBqPiNin1OrMzKwl9RoeETGst+lmZrZzyrmfh5mZGeDwMDOzHeDwMDOzbA4P\nMzPL5vAwM7NsDg8zM8vm8DAzs2wODzMzy+bwMDOzbA4PMzPL5vAwM7NsDg8zM8vm8DAzs2wODzMz\ny+bwMDOzbA4PMzPL5vAwM7NsDg8zM8vm8DAzs2wODzMzy+bwMDOzbA4PMzPL5vAwM7NsDg8zM8vm\n8DAzs2wODzMzy+bwMDOzbA4PMzPL5vAwM7NsDg8zM8vm8DAzs2yVhIekdZJWSnpA0vLUNkrSnZJW\np+f9auY/V9IaSY9JOraKms3M7DVVbnlMiYgjImJSGp8D3B0R44C70ziSxgPTgUOBqcDlkoZVUbCZ\nmRV2rbqAGtOA9jR8NdABnJPaF0bEVmCtpDXAZOBnFdRoZjtg7Jwllax33bzjKlnvzkARMfArldYC\nzwHbgW9HxHxJmyNiZJou4NmIGCnpMmBpRFyXpl0J3BERN/ew3FnALIC2traJCxcuHKAe9U9nZycj\nRoyotIaV658rZbltw2HjS/WnTzhg31LW24j+9rmvvtUzWPq8o/1rJfXe61b4mytLf/s2ZcqUFTV7\nhOqqasvjAxGxXtJbgTslPVo7MSJCUnaqRcR8YD7ApEmTor29vSnFlq2jo4Oqaz21pP8MZ0/YxoUr\n6/+arZvRXsp6G9HfPvfVt3oGS593tH+tpN573Qp/c2UZqL5VcswjItan503A9yh2Q22UtD9Aet6U\nZl8PjKl5+YGpzczMKjLg4SFpb0lv6hoGPgI8BCwGZqbZZgK3peHFwHRJe0g6CBgHLBvYqs3MrFYV\n26RtwPeKwxrsCvxTRPyzpPuARZJOA54ATgKIiFWSFgEPA9uAMyNiewV1m5lZMuDhERGPA4f30P5b\n4Jg6r5kLzC25NDMza5C/YW5mZtkcHmZmls3hYWZm2RweZmaWzeFhZmbZHB5mZpbN4WFmZtkcHmZm\nls3hYWZm2RweZmaWbXBfb9nMrEVVdQOsBVP3HpD1eMvDzMyyOTzMzCybw8PMzLI5PMzMLJvDw8zM\nsjk8zMwsm8PDzMyyOTzMzCybw8PMzLI5PMzMLJvDw8zMsjk8zMwsm8PDzMyyOTzMzCybw8PMzLI5\nPMzMLJvDw8zMsjk8zMwsm8PDzMyyOTzMzCybw8PMzLI5PMzMLJvDw8zMsjk8zMws265VF9CKxs5Z\nMqDrmz1hG6fOWcK6eccN6HrNzHaUtzzMzCzboAkPSVMlPSZpjaQ5VddjZrYzGxThIWkY8P+AjwLj\ngZMlja+2KjOzndegCA9gMrAmIh6PiJeBhcC0imsyM9tpKSKqrqFPkv4UmBoRf5HGPwP8t4j4Qrf5\nZgGz0ui7gMcGtNAdNxr4TdVFlMR9G7yGcv/ct/reERFv6WumIXW2VUTMB+ZXXUcuScsjYlLVdZTB\nfRu8hnL/3Lf+Gyy7rdYDY2rGD0xtZmZWgcESHvcB4yQdJGl3YDqwuOKazMx2WoNit1VEbJP0BeCH\nwDDgqohYVXFZzTTodrVlcN8Gr6HcP/etnwbFAXMzM2stg2W3lZmZtRCHh5mZZXN4DCBJV0naJOmh\nmrZRku6UtDo971dljf1Rp39/L+lRSb+Q9D1JI6uscUf11LeaabMlhaTRVdTWX/X6Jums9LNbJenr\nVdXXX3V+L4+QtFTSA5KWS5pcZY07StIYST+W9HD6OX0xtZf+ueLwGFgLgKnd2uYAd0fEOODuND5Y\nLeCN/bsTOCwi3gP8B3DuQBfVJAt4Y9+QNAb4CPCrgS6oiRbQrW+SplBcxeHwiDgU+EYFdTXLAt74\ns/s68DcRcQTw12l8MNoGzI6I8cBRwJnp0k2lf644PAZQRNwLPNOteRpwdRq+GjhxQItqop76FxE/\niohtaXQpxXd0Bp06PzuAi4GvAIP2zJM6ffs8MC8itqZ5Ng14YU1Sp38B7JOG9wWeHtCimiQiNkTE\n/Wn4eeAR4AAG4HPF4VG9tojYkIZ/DbRVWUzJ/hy4o+oimkXSNGB9RDxYdS0lOAQ4WtK/S7pH0vuq\nLqjJzgb+XtKTFFtVg3WL+HckjQXeC/w7A/C54vBoIVGcNz1o/4PtjaT/RbGJfX3VtTSDpL2A8yh2\neQxFuwKjKHaF/BWwSJKqLampPg98KSLGAF8Crqy4nn6RNAK4BTg7IrbUTivrc8XhUb2NkvYHSM+D\ndvdAPZJOBY4HZsTQ+WLRwcBBwIOS1lHsjrtf0tsqrap5ngK+G4VlwKsUF9wbKmYC303DN1FcuXtQ\nkrQbRXBcHxFdfSr9c8XhUb3FFL/IpOfbKqyl6SRNpTgm8PGIeLHqepolIlZGxFsjYmxEjKX4sD0y\nIn5dcWnNciswBUDSIcDuDK2r0D4NfDANfwhYXWEtOyxtDV4JPBIRF9VMKv9zJSL8GKAHcAOwAXiF\n4sPmNODNFGdDrAbuAkZVXWeT+7cGeBJ4ID2+VXWdzepbt+nrgNFV19nEn9vuwHXAQ8D9wIeqrrPJ\n/fsAsAJ4kOIYwcSq69zBvn2AYpfUL2r+xj42EJ8rvjyJmZll824rMzPL5vAwM7NsDg8zM8vm8DAz\ns2wODzMzy+bwsCFJ0vZ0xdSuxxsuDCepXdL3m7zedknvrxn/S0mnNGG5Y3u6om+zSLpA0pfLWr4N\nPYPiNrRmO+ClKK6YOtDagU7gpwAR8a0KajArnbc8bKciaWq6R8X9wJ/UtL/uP29JD6ULzSHplHQ/\nkgclXZvaTkgXDfy5pLsktaX5/xL4UtraObp2uTX3kOi6t8l+qb1D0t9JWibpPyQdndGfgyX9s6QV\nkv5V0rsl7SvpCUm7pHn2lvSkpN16mr+fb6ntpBweNlQN77bb6lOS9gS+A5wATAT6vA6VpEOBr1J8\nw/pw4Itp0k+AoyLivcBC4CsRsQ74FnBxRBwREf/abXHXAOdEcW+TlcD5NdN2jYjJFFd7PZ/GzQfO\nioiJwJeByyPiOYpvGnddfuN44IcR8UpP82esy+x3vNvKhqo37LaSdASwNiJWp/HrgFl9LOdDwE0R\n8RuAiOi6L8SBwI3ponO7A2t7W4ikfYGREXFParqa4oJ8XbouaLcCGNtHTV3LHAG8H7ip5oK3e6Tn\nG4FPAT8GpgOX9zG/WRaHh1lhG6/fEt+zj/m/CVwUEYsltQMX9HP9W9Pzdhr/u9wF2Fzn2M5i4GuS\nRlFsZf0LsHcv85tl8W4r25k8CoyVdHAaP7lm2jrgSABJR1Jcbh2KD91PSnpzmjYqte8LrE/DM19b\nDM8Db+q+4rQr6dma4xmfAe7pPl+OKO7bsFbSJ1NtknR4mtYJ3AdcAnw/Irb3Nr9ZLoeHDVXdj3nM\ni4j/othNtSQdMK+9x8EtwChJq4AvUNxvnYhYBcwF7pH0INB12esLKHb/rOD1lyq/HfhE1wHzbjXN\npLh73S+AI4D/k9mnd0l6qubxSWAGcFqqbRXF7Ue73Ah8Oj136W1+s4b5qrpmZpbNWx5mZpbN4WFm\nZtkcHmZmls3hYWZm2RweZmaWzeFhZmbZHB5mZpbt/wMEjr5ipA6NqwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0xbd31240>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# histogram of education\n",
    "dta.educ.hist()\n",
    "plt.title('Histogram of Education')\n",
    "plt.xlabel('Education Level')\n",
    "plt.ylabel('Frequency')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0xbff9898>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAEWCAYAAACe8xtsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XuYHVWZ7/Hvj4RLSEOCBtsYkCBGnECUIT0RQbSjjAYB\ngRkv4aBcRIIDHEGZRy6Po/g4cXCOeEEGJAoHECREFIlc9AASETVigkhIgCFKGIgx4ZaEBgQD7/mj\nVkPRdPfea3dX793J7/M8++mqVbWq3lq9e7+7VlWvUkRgZmaWY7NmB2BmZsOPk4eZmWVz8jAzs2xO\nHmZmls3Jw8zMsjl5mJlZNicPa4ikpZI6mx1HM0k6VNJDkrok/X2TY9lX0n3NjGGwpPZ8Q7PjsP45\nedgrSFohab8eZUdJuq17PiJ2i4gFNbYzUVJIGllRqM32VeDEiGiLiN/3XJiOfU35+CVtnsoG9R+s\nIuKXEbHrYG6zHpLOlPS39IG/VtKvJb09o/4CSZ8ol6X2/NPgR2uDycnDhq0WSEo7AUtrrPMEsH9p\nfv9U1pDejrkF2uHKiGgDxgG3AD9ocjw2BJw8rCHlsxNJ0yQtkrRe0mpJX0ur3Zp+rk3fTN8uaTNJ\nn5P0YPoGfqmkMaXtHpGWPSbp33rs50xJV0m6TNJ64Ki079+kb72rJJ0raYvS9kLS8ZLul/SkpC9J\n2iV9Q14vaV55/R7H2GuskraU1AWMAP4g6Y/9NNX3gCNK80cAl/bYz9GS7knx/UnScaVlnZIelnSq\npL8A/7e/slK90yT9MW1zmaRDS8tGSDpb0qOSHpB0YvkMMR3jhak9V0r6d0kj+jlGACJiA3A5MEHS\n9mlb20m6VtIjkp5I0zukZbOBfYFz0/vj3NLv7I1p+mJJ/yXpunQsv5W0S+lY3ivpPknrJJ0n6Rc9\nz2SsIhHhl18vewErgP16lB0F3NbbOsBvgI+l6TZgrzQ9EQhgZKnex4HlwBvSuj8CvpeWTQa6gHcA\nW1B0C/2ttJ8z0/whFF98RgFTgb2AkWl/9wAnl/YXwDXAtsBuwLPAzWn/Y4BlwJF9tEOfsZa2/cZ+\n2jGA3YHVwFhguzS9e/Gn9+J6BwC7AALeBTwN7JmWdQIbgK8AW6Zj7qvs4dI2PwS8LrXTR4CngPFp\n2SfTce+QYrqp/HsCrgYuAEYDrwFuB47r4xjPBC5L01sAZwGPlrb1auCfga2BbSjOSn5cqr8A+EQv\n7fbGNH0x8BgwLf2OLwfmpmXjgPXAP6VlJ6X3xyf6+p34NYifE80OwK/We1Ekhi5gben1NH0nj1uB\nLwLjemxnIq9MHjcDx5fmd01/8COBzwNXlJZtDTzHy5PHrTViPxm4ujQfwD6l+cXAqaX5s4Fv9LGt\nPmMtbbtW8ngj8F3guPSh/Z1UFv3U+zFwUpruTG2wVWl5X2UP97PNO4GD0/TPKSUDYL/u3xPQTpFg\nR5WWHwbc0sd2z0yxrAWeTx/0nf3EsQfwRGl+Qc8Pe16ZPL5bWvZ+4N40fQTwm9IyAQ/13J5f1bzc\nbWV9OSQixna/gOP7WfcY4E3AvZJ+J+nAftZ9HfBgaf5BXvrQeh3FHz8AEfE0xYdR2UPlGUlvSl0h\nf0ldWV+m+EZatro0/Uwv820NxJrjUooPuld0WQFI2l/SQkmPS1pL8QFZPoZHIuKvPar1Vlbe5hGS\n7kzdeWspzna6t/mydu4xvROwObCqVPcCijOQvsxL75F24G6Ks8HuOLaWdEHq+ltP8UVjbD3dYCV/\nKU0/zUu/r57vlwAexoaEk4cNWETcHxGHUXzAfAW4StJoim+QPf2Z4gOq2+spumBWA6soulIAkDSK\notvjZbvrMX8+cC8wKSK2Bc6g+AY6GPqLNccvgfEUH663lRdI2hL4IUUXXXv6EL6elx9Db+3Y591a\nknaiOMM5EXh12ubdpW2+rJ2BHUvTD1GceYwrfXnYNiJ2q3WQEfEoMAs4U9L4VHwKxRnb29Lv553d\nYdY6jjr0fL+Ilx+XVcjJwwZM0kclbR8RL1B0XwC8ADySfpbv2b8C+LSknSW1UZwpXBnFxdargIMk\n7Z0uYp9J7USwDUW/d5ekNwP/MljHVSPWuqVvxAcBH0jTZVtQXLd4BNggaX/gvQOMuztxPwLFBXmK\nM49u84CTJE2QNBY4tRTrKuD/AWdL2jbdNLCLpHfVs+OIuA/4GfDZVLQNxdndWkmvAr7Qo8pqXv7+\nyHEdMEXSIeli/wnAaxvclmVy8rDBMANYmu5A+iYwMyKeSd1Os4FfpS6QvYCLKO5AuhV4APgr8L8B\nImJpmp5L8a2yC1hD8U24L/8K/C/gSYpv21cO4nH1GWuuiFiajq9n+ZPApyg+0J+gOJb5jQactrmM\n4lrObyg+nKcAvyqt8h2KBHEX8HuKM50NFNcsoOhe24LiovoTFEl9PPX7P8AsSa8BvkFxQf9RYCHw\n0x7rfhP4YLoT65yMfXSf6XwI+E+K7s3JwCL6f7/YINErvwiZtYb0bX8tRZfUA82OZ2OVzna+HRE7\n1Vy5hUnajOKax+ERcUuz49nY+czDWoqkg9JF1tEU1wGWUNzZZYNE0ihJ75c0UtIEiq6kq5sdVyMk\nvU/S2HTtqPt618Imh7VJcPKwVnMwxYXqPwOTKLrAfHo8uERxa/UTFN1W91DcJj0cvR34I0W32EEU\ndwk+09yQNg3utjIzs2w+8zAzs2zNHlCtMuPGjYuJEyc2VPepp55i9OjRgxvQIHBceRxXHseVZ2ON\na/HixY9GxPY1V2z2v7hX9Zo6dWo06pZbbmm4bpUcVx7Hlcdx5dlY4wIWhYcnMTOzKjh5mJlZNicP\nMzPL5uRhZmbZnDzMzCybk4eZmWVz8jAzs2xOHmZmls3Jw8zMsm20w5OYmTXTxNOua8p+L54xNEOm\n+MzDzMyyOXmYmVk2Jw8zM8vm5GFmZtmcPMzMLJuTh5mZZXPyMDOzbE4eZmaWzcnDzMyyOXmYmVm2\nypKHpB0l3SJpmaSlkk5K5WdKWinpzvR6f6nO6ZKWS7pP0vtK5VMlLUnLzpGkquI2M7PaqhzbagNw\nSkTcIWkbYLGkG9Oyr0fEV8srS5oMzAR2A14H3CTpTRHxPHA+cCzwW+B6YAZwQ4Wxm5lZPyo784iI\nVRFxR5p+ErgHmNBPlYOBuRHxbEQ8ACwHpkkaD2wbEQsjIoBLgUOqitvMzGpT8Xlc8U6kicCtwO7A\nZ4CjgXXAIoqzkycknQssjIjLUp0LKc4uVgBnRcR+qXxf4NSIOLCX/cwCZgG0t7dPnTt3bkPxdnV1\n0dbW1lDdKjmuPI4rj+PKUyuuJSvXDWE0L9l5zIgBtdf06dMXR0RHrfUqH5JdUhvwQ+DkiFgv6Xzg\nS0Ckn2cDHx+MfUXEHGAOQEdHR3R2dja0nQULFtBo3So5rjyOK4/jylMrrqOaOCT7ULRXpXdbSdqc\nInFcHhE/AoiI1RHxfES8AHwHmJZWXwnsWKq+QypbmaZ7lpuZWZNUebeVgAuBeyLia6Xy8aXVDgXu\nTtPzgZmStpS0MzAJuD0iVgHrJe2VtnkEcE1VcZuZWW1VdlvtA3wMWCLpzlR2BnCYpD0ouq1WAMcB\nRMRSSfOAZRR3ap2Q7rQCOB64GBhFcR3Ed1qZmTVRZckjIm4Devt/jOv7qTMbmN1L+SKKi+1mZtYC\n/B/mZmaWzcnDzMyyOXmYmVk2Jw8zM8vm5GFmZtmcPMzMLJuTh5mZZXPyMDOzbE4eZmaWzcnDzMyy\nOXmYmVk2Jw8zM8vm5GFmZtmcPMzMLFvlj6E1M5s4gEeynjJlQ8OPdF1x1gEN79f65zMPMzPL5uRh\nZmbZnDzMzCybk4eZmWVz8jAzs2xOHmZmls3Jw8zMsjl5mJlZNicPMzPL5uRhZmbZnDzMzCybk4eZ\nmWVz8jAzs2xOHmZmlq2y5CFpR0m3SFomaamkk1L5qyTdKOn+9HO7Up3TJS2XdJ+k95XKp0pakpad\nI0lVxW1mZrVVeeaxATglIiYDewEnSJoMnAbcHBGTgJvTPGnZTGA3YAZwnqQRaVvnA8cCk9JrRoVx\nm5lZDZUlj4hYFRF3pOkngXuACcDBwCVptUuAQ9L0wcDciHg2Ih4AlgPTJI0Hto2IhRERwKWlOmZm\n1gQqPo8r3ok0EbgV2B34n4gYm8oFPBERYyWdCyyMiMvSsguBG4AVwFkRsV8q3xc4NSIO7GU/s4BZ\nAO3t7VPnzp3bULxdXV20tbU1VLdKjiuP48pTZVxLVq5ruG77KFj9TGN1p0wY0/B+a6nVXgM55oHY\necyIAf0ep0+fvjgiOmqtV/ljaCW1AT8ETo6I9eXLFRERkgYte0XEHGAOQEdHR3R2dja0nQULFtBo\n3So5rjyOK0+VcTX6GFkoHkN79pLGPqpWHN7Z8H5rqdVeAznmgbh4xugheX9VereVpM0pEsflEfGj\nVLw6dUWRfq5J5SuBHUvVd0hlK9N0z3IzM2uSKu+2EnAhcE9EfK20aD5wZJo+ErimVD5T0paSdqa4\nMH57RKwC1kvaK23ziFIdMzNrgiq7rfYBPgYskXRnKjsDOAuYJ+kY4EHgwwARsVTSPGAZxZ1aJ0TE\n86ne8cDFwCiK6yA3VBi3mZnVUFnyiIjbgL7+H+M9fdSZDczupXwRxcV2MzNrAf4PczMzy+bkYWZm\n2Zw8zMwsm5OHmZllc/IwM7NsTh5mZpbNycPMzLI5eZiZWTYnDzMzy+bkYWZm2Zw8zMwsm5OHmZll\nc/IwM7NsTh5mZpbNycPMzLI5eZiZWba6koekKVUHYmZmw0e9Zx7nSbpd0vGSxlQakZmZtby6kkdE\n7AscDuwILJb0fUn/WGlkZmbWsuq+5hER9wOfA04F3gWcI+leSf9UVXBmZtaa6r3m8RZJXwfuAd4N\nHBQRf5emv15hfGZm1oJG1rnet4DvAmdExDPdhRHxZ0mfqyQyMzNrWfUmjwOAZyLieQBJmwFbRcTT\nEfG9yqIzM7OWVO81j5uAUaX5rVOZmZltgupNHltFRFf3TJreupqQzMys1dWbPJ6StGf3jKSpwDP9\nrG9mZhuxeq95nAz8QNKfAQGvBT5SWVRmZtbS6koeEfE7SW8Gdk1F90XE36oLy8zMWlm9Zx4A/wBM\nTHX2lEREXFpJVGZm1tLqSh6SvgfsAtwJPJ+KA3DyMDPbBNV75tEBTI6IqHfDki4CDgTWRMTuqexM\n4FjgkbTaGRFxfVp2OnAMRXL6VET8LJVPBS6muFX4euCknDjMzGzw1Xu31d0UF8lzXAzM6KX86xGx\nR3p1J47JwExgt1TnPEkj0vrnUyScSenV2zbNzGwI1XvmMQ5YJul24Nnuwoj4QF8VIuJWSRPr3P7B\nwNyIeBZ4QNJyYJqkFcC2EbEQQNKlwCHADXVu18zMKqB6eoAkvau38oj4RY16E4Fre3RbHQ2sAxYB\np0TEE5LOBRZGxGVpvQspEsQK4KyI2C+V7wucGhEH9rG/WcAsgPb29qlz586teWy96erqoq2traG6\nVXJceRxXnirjWrJyXcN120fB6gb/q2zKhOoeP1SrvQZyzAOx85gRA/o9Tp8+fXFEdNRar95bdX8h\naSdgUkTcJGlrYESter04H/gSxcX2LwFnAx9vYDt9xTkHmAPQ0dERnZ2dDW1nwYIFNFq3So4rj+PK\nU2VcR512XcN1T5mygbOX5NwY+pIVh3c2vN9aarXXQI55IC6eMXpI3l/1Dsl+LHAVcEEqmgD8OHdn\nEbE6Ip6PiBeA7wDT0qKVFA+a6rZDKluZpnuWm5lZE9V7wfwEYB9gPbz4YKjX5O5M0vjS7KEUF+IB\n5gMzJW0paWeKC+O3R8QqYL2kvSQJOAK4Jne/ZmY2uOo9F3w2Ip4rPr9B0kiKrqc+SboC6ATGSXoY\n+ALQKWmPVHcFcBxARCyVNA9YBmwATuge/h04npdu1b0BXyw3M2u6epPHLySdAYxKzy4/HvhJfxUi\n4rBeii/sZ/3ZwOxeyhcBu9cZp5mZDYF6u61Oo/jHviUUZwvXUzzP3MzMNkH13m3VfYH7O9WGY2Zm\nw0G9Y1s9QC/XOCLiDYMekZmZtbycsa26bQV8CHjV4IdjZmbDQV3XPCLisdJrZUR8Azig4tjMzKxF\n1dtttWdpdjOKM5HG/uXTzMyGvXoTwNml6Q0U/6Px4UGPxszMhoV677aaXnUgZmY2fNTbbfWZ/pZH\nxNcGJxwzMxsOcu62+geKMagADgJuB+6vIigzM2tt9SaPHYA9I+JJePG5HNdFxEerCszMzFpXvcOT\ntAPPleafS2VmZrYJqvfM41LgdklXp/lDgEuqCcnMzFpdvXdbzZZ0A7BvKjo6In5fXVhmZtbK6u22\nAtgaWB8R3wQeTg9tMjOzTVC9j6H9AnAqcHoq2hy4rKqgzMystdV75nEo8AHgKYCI+DOwTVVBmZlZ\na6s3eTwXEUEall3S6OpCMjOzVldv8pgn6QJgrKRjgZvwg6HMzDZZ9d5t9dX07PL1wK7A5yPixkoj\nMzOzllUzeUgaAdyUBkd0wjAzs9rdVhHxPPCCpDFDEI+ZmQ0D9f6HeRewRNKNpDuuACLiU5VEZWZm\nLa3e5PGj9DIzM+s/eUh6fUT8T0R4HCszM3tRrWseP+6ekPTDimMxM7NholbyUGn6DVUGYmZmw0et\n5BF9TJuZ2Sas1gXzt0paT3EGMipNk+YjIratNDozM2tJ/Z55RMSIiNg2IraJiJFpunu+38Qh6SJJ\nayTdXSp7laQbJd2ffm5XWna6pOWS7pP0vlL5VElL0rJzJKnnvszMbGjlPM8j18XAjB5lpwE3R8Qk\n4OY0j6TJwExgt1TnvPSf7QDnA8cCk9Kr5zbNzGyIVZY8IuJW4PEexQfz0uNrL6F4nG13+dyIeDYi\nHgCWA9MkjQe2jYiFaVTfS0t1zMysSao88+hNe0SsStN/AdrT9ATgodJ6D6eyCWm6Z7mZmTWRii/0\nFW1cmghcGxG7p/m1ETG2tPyJiNhO0rnAwoi4LJVfCNwArADOioj9Uvm+wKkRcWAf+5sFzAJob2+f\nOnfu3Ibi7urqoq2traG6VXJceVo1rjWPr2P1M83Z95QJfQ9RV2V7LVm5ruG67aNouL36O96BqtVe\nAznmgdh5zIgB/R6nT5++OCI6aq1X7/Akg2W1pPERsSp1Sa1J5SuBHUvr7ZDKVqbpnuW9iog5wByA\njo6O6OzsbCjIBQsW0GjdKjmuPK0a17cuv4azlwz1n15hxeGdfS6rsr2OOu26huueMmVDw+3V3/EO\nVK32GsgxD8TFM0YPyft+qLut5gNHpukjgWtK5TMlbSlpZ4oL47enLq71kvZKd1kdUapjZmZNUtnX\nH0lXAJ3AOEkPA18AzqJ4KuExwIPAhwEiYqmkecAyYANwQhoKHuB4iju3RlF0Zd1QVcxmZlafypJH\nRBzWx6L39LH+bGB2L+WLgN0HMTQzMxugoe62MjOzjYCTh5mZZXPyMDOzbE4eZmaWzcnDzMyyOXmY\nmVk2Jw8zM8vm5GFmZtmcPMzMLJuTh5mZZXPyMDOzbE4eZmaWzcnDzMyyOXmYmVk2Jw8zM8vm5GFm\nZtmcPMzMLJuTh5mZZXPyMDOzbE4eZmaWzcnDzMyyOXmYmVk2Jw8zM8vm5GFmZtmcPMzMLJuTh5mZ\nZXPyMDOzbE4eZmaWzcnDzMyyOXmYmVm2piQPSSskLZF0p6RFqexVkm6UdH/6uV1p/dMlLZd0n6T3\nNSNmMzN7STPPPKZHxB4R0ZHmTwNujohJwM1pHkmTgZnAbsAM4DxJI5oRsJmZFVqp2+pg4JI0fQlw\nSKl8bkQ8GxEPAMuBaU2Iz8zMEkXE0O9UegBYBzwPXBARcyStjYixabmAJyJirKRzgYURcVladiFw\nQ0Rc1ct2ZwGzANrb26fOnTu3ofi6urpoa2trqG6VHFeeVo1rzePrWP1Mc/Y9ZcKYPpdV2V5LVq5r\nuG77KBpur/6Od6BqtddAjnkgdh4zYkC/x+nTpy8u9Qj1aWTDexiYd0TESkmvAW6UdG95YUSEpOys\nFhFzgDkAHR0d0dnZ2VBwCxYsoNG6VXJceVo1rm9dfg1nL2nOn96Kwzv7XFZlex112nUN1z1lyoaG\n26u/4x2oWu01kGMeiItnjB6S931Tuq0iYmX6uQa4mqIbarWk8QDp55q0+kpgx1L1HVKZmZk1yZAn\nD0mjJW3TPQ28F7gbmA8cmVY7ErgmTc8HZkraUtLOwCTg9qGN2szMyppx7twOXF1c1mAk8P2I+Kmk\n3wHzJB0DPAh8GCAilkqaBywDNgAnRMTzTYjbzMySIU8eEfEn4K29lD8GvKePOrOB2RWHZmZmdWql\nW3XNzGyYcPIwM7NsTh5mZpbNycPMzLI5eZiZWTYnDzMzy9as4UmsxUwc4PARjQ7FsOKsAxrer5k1\nj888zMwsm5OHmZllc/IwM7NsTh5mZpbNycPMzLI5eZiZWTYnDzMzy+bkYWZm2Zw8zMwsm5OHmZll\nc/IwM7NsTh5mZpbNycPMzLI5eZiZWTYnDzMzy+bkYWZm2Zw8zMwsm5OHmZll82Noe7Fk5bqGH6s6\nEH4kq5kNFz7zMDOzbE4eZmaWzcnDzMyyOXmYmVm2YZM8JM2QdJ+k5ZJOa3Y8ZmabsmGRPCSNAP4L\n2B+YDBwmaXJzozIz23QNi+QBTAOWR8SfIuI5YC5wcJNjMjPbZCkimh1DTZI+CMyIiE+k+Y8Bb4uI\nE3usNwuYlWZ3Be5rcJfjgEcbrFslx5XHceVxXHk21rh2iojta620Uf2TYETMAeYMdDuSFkVExyCE\nNKgcVx7Hlcdx5dnU4xou3VYrgR1L8zukMjMza4Lhkjx+B0yStLOkLYCZwPwmx2RmtskaFt1WEbFB\n0onAz4ARwEURsbTCXQ6466sijiuP48rjuPJs0nENiwvmZmbWWoZLt5WZmbUQJw8zM8u2ySYPSRdJ\nWiPp7j6WS9I5aTiUuyTt2SJxdUpaJ+nO9Pr8EMW1o6RbJC2TtFTSSb2sM+RtVmdcQ95mkraSdLuk\nP6S4vtjLOs1or3riasp7LO17hKTfS7q2l2VN+ZusI65m/U2ukLQk7XNRL8urba+I2CRfwDuBPYG7\n+1j+fuAGQMBewG9bJK5O4NomtNd4YM80vQ3w38DkZrdZnXENeZulNmhL05sDvwX2aoH2qieuprzH\n0r4/A3y/t/0362+yjria9Te5AhjXz/JK22uTPfOIiFuBx/tZ5WDg0igsBMZKGt8CcTVFRKyKiDvS\n9JPAPcCEHqsNeZvVGdeQS23QlWY3T6+ed6c0o73qiaspJO0AHAB8t49VmvI3WUdcrarS9tpkk0cd\nJgAPleYfpgU+lJK902noDZJ2G+qdS5oI/D3Ft9ayprZZP3FBE9osdXXcCawBboyIlmivOuKC5rzH\nvgF8Fnihj+XNen/Vigua014B3CRpsYqhmXqqtL2cPIafO4DXR8RbgG8BPx7KnUtqA34InBwR64dy\n3/2pEVdT2iwino+IPShGRJgmafeh2G8tdcQ15O0l6UBgTUQsrnpfOeqMq1l/k+9Iv8f9gRMkvXOI\n9gs4efSnJYdEiYj13d0OEXE9sLmkcUOxb0mbU3xAXx4RP+pllaa0Wa24mtlmaZ9rgVuAGT0WNfU9\n1ldcTWqvfYAPSFpBMWr2uyVd1mOdZrRXzbia9f6KiJXp5xrgaorRx8sqbS8nj77NB45IdyzsBayL\niFXNDkrSayUpTU+j+B0+NgT7FXAhcE9EfK2P1Ya8zeqJqxltJml7SWPT9CjgH4F7e6zWjPaqGVcz\n2isiTo+IHSJiIsXwQz+PiI/2WG3I26ueuJr0/hotaZvuaeC9QM87NCttr2ExPEkVJF1BcZfEOEkP\nA1+guHhIRHwbuJ7iboXlwNPA0S0S1weBf5G0AXgGmBnp1oqK7QN8DFiS+ssBzgBeX4qtGW1WT1zN\naLPxwCUqHmS2GTAvIq6V9MlSXM1or3riatZ77BVaoL3qiasZ7dUOXJ1y1kjg+xHx06FsLw9PYmZm\n2dxtZWZm2Zw8zMwsm5OHmZllc/IwM7NsTh5mZpbNycM2SpKi/M9ckkZKekS9jIrawLZ/PdBt1Nj+\nUSnWOyXdK+nTddTplLR3af6Tko6oMk7btG2y/+dhG72ngN0ljYqIZyj+GS7rv2sljYyIDT3nI2Lv\n/uoNkisj4kRJrwbuk3RVRDzUz/qdQBfwa3jxPn+zyvjMwzZm11OMhgpwGHBF9wJJ0yT9RsUzGn4t\naddUfpSk+ZJ+DtycvtH/UtJ8YFlapyv9bJN0s6Q7VDxX4eDS9v9N0n2SbpN0haR/TeW7SPqpisHs\nfinpzf0dQEQ8RvFPXuNT/YMk/TbFfZOkdhUDQn4S+HQ6W9lX0pmlfS6Q9BUVz/H4b0n7pvKtJc1T\n8SyUq9N2OwbY5raJ8JmHbczmAp9PXVVvAS4C9k3L7gX2jYgNkvYDvgz8c1q2J/CWiHhcUmea3z0i\nHuix/b8Ch0bEehVjGS1MSaYjbeutFKMD3AF0D6w3B/hkRNwv6W3AecC7+zoASa8HtgLuSkW3UTx/\nIyR9AvhsRJwi6dtAV0R8NdV7T49NjYyIaZLeTzFqwX7A8cATETFZxeCId2JWJycP22hFxF3pW/lh\nFGchZWMohumYRDG09ealZTdGRPmZKrf3kjigeMjOl1WMZvoCxXDX7RRDplwTEX8F/irpJ/DiyL97\nAz9Iw0oAbNlH+B9J230zcGLaFhSD212p4rkMWwC9xdWb7gEjFwMT0/Q7gG8CRMTdku7qpZ5Zr9xt\nZRu7+cBXKXVZJV8CbomI3YGDKL7dd3uqx7o957sdDmwPTE1DY6/usZ2eNgPWRsQepdff9bHulWmI\n772BsyS9NpV/Czg3IqYAx9XYX9mz6efz+EujDQInD9vYXQR8MSKW9Cgfw0sX0I9qcNtjKJ718DdJ\n04GdUvmvgINUPC+8DTgQiqG7gQckfQhefMb0W/vbQUQsAr4HdD+bvRz3kaVVn6R4DG+OXwEfTrFM\nBqZk1rdNmJOHbdQi4uGIOKeXRf8J/Iek39P4N/HLgQ5JS4AjSEObR8TvKM547qJ4hvQSYF2qczhw\njKQ/AEttR5HKAAAAnElEQVQpHhVay1eAo1UMwX0mRbfXYuDR0jo/AQ7tvmBeZ/znAdtLWgb8e4pn\nXf9VzAoeVdesApLaIqJL0tbArcCsSM9abxVpWPbNI+KvknYBbgJ2jYjnmhyaDQPu+zSrxpzUFbQV\ncEmrJY5ka+AWFU9iFHC8E4fVy2ceZmaWzdc8zMwsm5OHmZllc/IwM7NsTh5mZpbNycPMzLL9fzEs\n/8/K5DQyAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0xbd31908>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# histogram of marriage rating\n",
    "dta.rate_marriage.hist()\n",
    "plt.title('Histogram of Marriage Rating')\n",
    "plt.xlabel('Marriage Rating')\n",
    "plt.ylabel('Frequency')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a look at the distribution of marriage ratings for those having affairs versus those not having affairs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0xbeeb080>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAEcCAYAAAA/aDgKAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xu8VXWd//HXm4sCYngDRFEPGqPgjfCIlDphWpJ3bSYl\nG7Xxfkmb0V9ZYxNNWTYPJ0pHU8wbqCk6eWnUGtG0tBDBULxgkmKAgIgKYtz5/P5Y3wObw7nsBfty\nztnv5+NxHmet71rruz77u/den72+37XXVkRgZmaWR6dqB2BmZu2Pk4eZmeXm5GFmZrk5eZiZWW5O\nHmZmlpuTh5mZ5ebk0YFI2lXSUkmdqx3L5pL0sqQRVdp3SdtR0g2Svp2mR0iaU4p6U32HSnqtVPUV\n1FsnKSR1KXXdzezvfEkLUrtvL+lgSa+n+RNa2bYsbWAtc/IoM0mzJK2UtEOj8j+lN2ddqfYVEX+N\niJ4RsaZUdRZD0hmS1qQ3+hJJL0g6Jsf2t0n6fmFZROwdEU+WOdalkt6UdKukvyvYd1HtmOp6urV9\nRsR5EfG9EsUfkj5eUPfvI2LPUtRdbqm9QtLJjcq7Aj8GPpfafRHwH8B/p/kHWqp3U9pA0vGSpqXX\n67uSnpA0IC0bLemOHHWV9ANBe+HkURlvAqMaZiTtC/TY1Mqa+jRYqU+ILfhjRPQEtgGuB+6WtE2V\nY2pOQ6y9gCOAZcBUSfuUekcd4SywhE4H3gNOa1TeF+gGvFxQtluj+U3SzHvl48A44FKy18AA4Dqg\noh+62r2I8F8Z/4BZwBXAcwVlVwP/BgRQl8qOBv4ELAFmA6ML1q9L654J/BX4XStlXdJ2XwFeBT4E\n3gDObRTb14F5wNvAWWnbj6dlW6Y4/wosAG4AujfzGM8Ani6Y75HqOrCg7F5gPrA4xbp3Kj8HWAWs\nBJYCvypotyPS9GhgAtkb/kOyg0p9Qd1DU9t9mPZzD/D9YmItKP9f4L5G7d2lYJs3Uv1vAqcCg4Dl\nZAecpcAHad3bgJ8BjwAfkSWn2xriAUYAc4BvAe+mx3lqQRxPAmc1FW9qt0j1LgVObqivYP1BqY4P\nUjsdV7DsNrKD5MPpsTwL7NFMOzW0wTlkr495wGVp2Y7A34DtGz0HC4GuzdS3G7AW+AKwGtgxlf9d\nejyRHtMTwF/SustS2Za08Fpuog1mAd8AXgRWNDyPBcv/AZjWTJwjyV6Lq9K+X2jpvQRsleJcm9Zf\nCuxU+Jw3E+M3gLmpvteAw6t9rMp9bKt2AB39L72Qj0gvkEFAZ7KDx25smDxGAPuSnQ3uR3bAPiEt\na3gjj0sv1u6tlDUc9I4G9gAEfDq94YemZSPJDuZ7kx3s72DD5DEGeAjYDtga+BXww2Ye4xmsP8B1\nBi5Mb8A+Bev8c6pnS+AnhW/exm+0wnZL06PJDtRHpfp/CExKy7YA3gIuAboCJ6V9500e/wwsaNTe\nXVLbLgH2TMv6sT7xbVRXeiyLgYPTc9mNjZPHarJumi3T8/JRQf1P0kzySPPrnqOC+uak6a7ATLLE\ntAXwGbKD054FsS0ChqXHdidwdzPt1NAGv0htsC9Zcmh4Th4Bzi9YfwxwbQvvg28Dk9P0dODSJvbV\npaBs3fNfxGt5XRsUbDsN2IUmPvAAu5O9nsYAhwE9Gy0fDdzRqKzo/Tf1mm70PO1J9gFxp4LH32QS\nb8t/7raqnPFkp+ufJfsEM7dwYUQ8GRHTI2JtRLxI9qb9dKM6RkfERxGxrJWyhjofjoi/ROYp4P+A\nQ9PiLwK3RsTLEfE3sjcMAJJE9onzXyLivYj4EPgBcEoLj2+4pA/I3pRXA1+OiHcKYrklIj6MiBVp\nX/tL6tVCfY09HRGPRDYOMR7Yv2G/ZAfCayJiVUT8Epico94Gb5MlyqasBfaR1D0i5kVEa90pD0bE\nM+m5XN7MOt+OiBXpeXmY7PnYXMOBnsBVEbEyIp4gO6MaVbDO/RExOSJWkyWPIa3U+d30+poO3FpQ\n1+3Al2Fd19wosuelOacBd6Xpu9i466pFrbyWm3JNRMxu5n3xBtnBfGeyM9p307hbzxLuvyVryD44\nDJbUNSJmRcRfNrGuqnHyqJzxwJfIPkmOa7xQ0kGSfitpoaTFwHnADo1Wm91EvU2VNdT5eUmTJL2X\nDuxHFdS5U6NtC6d7k52NTJX0Qdr216m8OZMiYhtgW7IzlnVvLEmdJV0l6S+SlpB9MqSJx9eS+QXT\nfwO6pf7snYC5kT7CNfFYirUzWX/8BiLiI7LuofOAeZIelrRXK3W1tv/3U70N3iJ7HJtrJ2B2RKxt\nVPfOBfON27HZA2ZS+FgK43yQ7OA3gOwD0eKIaDJpSzqYbFzh7lR0F7CvpNYSV2EdLb2WW4t7IxEx\nKSK+GBG9yV6rf0/WlVyq/be075nA18g+RL0j6W5JpXj+K8rJo0Ii4i2y/vKjgF82scpdZAfdXSKi\nF9kYgxpX01TVTe1P0pbA/5CdBfRNB/ZHCuqcB/Qv2GSXgul3yfpx946IbdJfr8gGmVsUEUuB84F/\nkvSJVPwl4Hiy7rteZKfpFMSyObd2ngfsnM6WGuzS3MotOBH4fVMLIuI3EfFZsi6rGcBNDYuaqau1\nx7OtpK0K5nclO/OBrAur8GKKHVupq9DbwC6SCt/Xu9LoLDenwrZcF2c6o5pAdvbxT7R81nE62XM9\nTdJ8srGWhvJWFfFabkrRr6mIeI7sPdlwwcQG2xax/6b21eLzGBF3RcQhrO++/lGx8bYVTh6VdSbw\nmUafOhtsDbwXEcslDSM74G6OLchOjRcCqyV9HvhcwfIJwFckDZLUg6xPGoD0yfUmYIykPgCSdpZ0\nZDE7joj3gJ8D/56KtiYbuFxE9ob6QaNNFpD1Q2+KP5J1A1wkqYuk48n69FuVzogGSLqWrBvju02s\n0zdd1rlVegxLybqxGuLuL2mLTYj7u5K2kHQocAzZQD9kffUnSeqRrgo6s9F2LbXVs2RnE1+X1FXZ\n92SOZf0n/k3x7RTL3mSDxvcULBtHdiZ9HM0kD0ndyLrkziHrImv4+yrwpSKvEmzttZyLpEMknV3w\n2t4rPYZJaZUFQF1BEm5t/wuA7Rt1w04DjpK0naQdyc40Gva/p6TPpKS0nPUD7u2Kk0cFpT7TKc0s\nvgD4D0kfkh10J2zmvj4ELk71vE+WjB4qWP4ocA3wW7JB1oY3zor0/xsN5amraSLZQF+xfkL25tmP\n7CDzFtkn4FcK9tXgZrIukA8ktXhNf2MRsZJskPxMsiuMvkzWz7+ihc0+KWkp2UD4k8DHyK4Mm97E\nup2AfyX7xP0e2TjU+WnZE2RXNM2X9G6OsOeTPSdvk407nBcRM9KyMWQD/gvIxhXubLTtaOD21FYb\njJOktjgW+DzZ2eP1wGkFdW+Kp8heB48DV0fE/xXs7xmyg97z6cy6KSeQHRzHRcT8hj/gFrKxqpGt\nBdDaa3kTfECWLKan18GvgfuB/0zLGxL5IknPF/FemkE2RvlGel52IkumL5B10f4fGybdLYGryJ6j\n+UAf4Jub8XiqQht2FVutkjQIeAnYMg2mtluSngVuiIhbqx1LRyfpCeCuiPh5tWOxyvKZRw2TdKKk\nLSVtS9bn+qv2mDgkfVrSjqnb6nSyS51/Xe24OjpJB5J9v+Oe1ta1jsfJo7adC7xD9qWsNazvjmlv\n9iTrIviA7FvD/xAR86obUscm6XayrsyvpW4dqzHutjIzs9x85mFmZrlV+2Z6ZbPDDjtEXV1dtcMw\nM2tXpk6d+m768mSLOmzyqKurY8qU5q6KNTOzpkhq7rLrDbjbyszMcnPyMDOz3Jw8zMwstw475tGU\nVatWMWfOHJYvb+4u2R1Pt27d6N+/P127dq12KGbWgdRU8pgzZw5bb701dXV1bHgT1o4pIli0aBFz\n5sxhwIAB1Q7HzDqQmuq2Wr58Odtvv31NJA4ASWy//fY1daZlZpVRU8kDqJnE0aDWHq+ZVUbNJQ8z\nM9t8Th4ldO+99zJo0CAOO+wwAEaNGsV+++3HmDFjmt3mhhtuYNy4jX6V1sysTaupAfNyu/nmm7np\npps45JBDmD9/Ps899xwzZ85scZvzzjuvyfLVq1fTpYufHrP2oO7yhze7jllXHV2CSCrHR6dNdMIJ\nJzB79myWL1/OJZdcwvz583n66ac588wzOe644/jNb37D3LlzGTJkCNdeey0zZsxg7NixrFy5ko9/\n/OOMHz+eHj16MHr0aHr27Mlll13GiBEjGDJkCE8//TSjRo3i0ksvrfbDNDNrkpPHJrrlllvYbrvt\nWLZsGQceeCBPPfUUTzzxBFdffTX19fVceOGFHHPMMUybNg2AwYMHc/bZZwNwxRVXcPPNN/PVr351\no3pXrlzpe3KZWZvn5LGJrrnmGu6//34AZs+ezeuvv97i+i+99BJXXHEFH3zwAUuXLuXII49scr2T\nTz655LGamZWak8cmePLJJ5k4cSJ//OMf6dGjByNGjGj1uxRnnHEGDzzwAPvvvz+33XYbTz75ZJPr\nbbXVVmWI2MystHy11SZYvHgx2267LT169GDGjBlMmjSp1W0+/PBD+vXrx6pVq7jzzjsrEKWZWfk4\neWyCkSNHsnr1agYNGsTll1/O8OHDW93me9/7HgcddBAHH3wwe+21VwWiNDMrnw77G+b19fXReOD5\n1VdfZdCgQVWKqHpq9XGbVUpHulRX0tSIqG9tPZ95mJlZbk4eZmaWm5OHmZnl5uRhZma5OXmYmVlu\nTh5mZpZbTX/DvBSX1xUq5lK7zp07s++++66bf+CBB6irq2u6vlmzOOaYY3jppZdKFaKZWUmULXlI\n2gUYB/QFAhgbET+VtB1wD1AHzAK+GBHvp22+CZwJrAEujojfpPIDgNuA7sAjwCXRTr+g0r1793U3\nSzQza6/K2W21Grg0IgYDw4ELJQ0GLgcej4iBwONpnrTsFGBvYCRwvaTOqa6fAWcDA9PfyDLGXXGz\nZs3i0EMPZejQoQwdOpQ//OEPG63z8ssvM2zYMIYMGcJ+++237kaMd9xxx7ryc889lzVr1lQ6fDOr\nQWVLHhExLyKeT9MfAq8COwPHA7en1W4HTkjTxwN3R8SKiHgTmAkMk9QP+FhETEpnG+MKtml3li1b\nxpAhQxgyZAgnnngiAH369OGxxx7j+eef55577uHiiy/eaLsbbriBSy65hGnTpjFlyhT69+/Pq6++\nyj333MMzzzzDtGnT6Ny5s++bZWYVUZExD0l1wCeAZ4G+ETEvLZpP1q0FWWIpvMPgnFS2Kk03Lm9q\nP+cA5wDsuuuupQm+xJrqtlq1ahUXXXTRugTw5z//eaPtPvnJT3LllVcyZ84cTjrpJAYOHMjjjz/O\n1KlTOfDAA4EsMfXp06cij8PMalvZk4eknsD/AF+LiCWS1i2LiJBUsrGLiBgLjIXs3lalqrfcxowZ\nQ9++fXnhhRdYu3Yt3bp122idL33pSxx00EE8/PDDHHXUUdx4441EBKeffjo//OEPqxC1mdWysl6q\nK6krWeK4MyJ+mYoXpK4o0v93UvlcYJeCzfunsrlpunF5h7F48WL69etHp06dGD9+fJPjFm+88Qa7\n7747F198Mccffzwvvvgihx9+OPfddx/vvJM14Xvvvcdbb71V6fDNrAaV82orATcDr0bEjwsWPQSc\nDlyV/j9YUH6XpB8DO5ENjE+OiDWSlkgaTtbtdRpwbSlibCt3sbzgggv4whe+wLhx4xg5cmSTPwg1\nYcIExo8fT9euXdlxxx351re+xXbbbcf3v/99Pve5z7F27Vq6du3Kddddx2677VaFR2FmtaRst2SX\ndAjwe2A6sDYVf4ssAUwAdgXeIrtU9720zb8B/0x2pdbXIuLRVF7P+kt1HwW+2tqlur4l+3q1+rjN\nKqUWb8letjOPiHgaUDOLD29mmyuBK5sonwLsU7rozMxsc/j2JGZmlpuTh5mZ5ebkYWZmuTl5mJlZ\nbk4eZmaWW03fkp3RvUpc3+IWFy9atIjDD88uNJs/fz6dO3emd+/eAEyePJktttiitPGYmZVJbSeP\nCtt+++3X3ddq9OjR9OzZk8suu2yDdSKCiKBTJ58Umlnb5SNUGzBz5kwGDx7Mqaeeyt57783s2bPZ\nZptt1i2/++67OeusswBYsGABJ510EvX19QwbNoxJkyY1V62ZWdn4zKONmDFjBuPGjaO+vp7Vq1c3\nu97FF1/M17/+dYYPH+5fGjSzqnHyaCP22GMP6utbvSMAEydO5LXXXls3//7777Ns2TK6d+9ezvDM\nzDbg5NFGFN4MsVOnThTeumv58uXrpiPCg+tmVnUe82iDOnXqxLbbbsvrr7/O2rVruf/++9ctO+KI\nI7juuuvWzfv30M2sGmr7zKOVS2ur6Uc/+hFHHnkkffr04YADDmDFihUAXHfddZx//vnceuutrF69\nmsMOO2yDZGJmVglluyV7tfmW7OvV6uM2q5RavCW7u63MzCw3Jw8zM8ut5pJHR+2ma06tPV4zq4ya\nSh7dunVj0aJFNXNAjQgWLVpEt27dqh2KmXUwNXW1Vf/+/ZkzZw4LFy6sdigV061bN/r371/tMMys\ng6mp5NG1a1cGDBhQ7TDMzNq9muq2MjOz0nDyMDOz3Jw8zMwsNycPMzPLzcnDzMxyc/IwM7PcnDzM\nzCw3Jw8zM8vNycPMzHJz8jAzs9ycPMzMLDcnDzMzy83Jw8zMcnPyMDOz3Jw8zMwsNycPMzPLzcnD\nzMxyc/IwM7PcypY8JN0i6R1JLxWUjZY0V9K09HdUwbJvSpop6TVJRxaUHyBpelp2jSSVK2YzMytO\nOc88bgNGNlE+JiKGpL9HACQNBk4B9k7bXC+pc1r/Z8DZwMD011SdZmZWQV3KVXFE/E5SXZGrHw/c\nHRErgDclzQSGSZoFfCwiJgFIGgecADxa+ojNLI+6yx/e7DpmXXV0CSKxaqjGmMdXJb2YurW2TWU7\nA7ML1pmTynZO043LmyTpHElTJE1ZuHBhqeM2M7Ok0snjZ8DuwBBgHvBfpaw8IsZGRH1E1Pfu3buU\nVZuZWYGKJo+IWBARayJiLXATMCwtmgvsUrBq/1Q2N003LjczsyqqaPKQ1K9g9kSg4Uqsh4BTJG0p\naQDZwPjkiJgHLJE0PF1ldRrwYCVjNjOzjZVtwFzSL4ARwA6S5gDfAUZIGgIEMAs4FyAiXpY0AXgF\nWA1cGBFrUlUXkF251Z1soNyD5WZmVVbOq61GNVF8cwvrXwlc2UT5FGCfEoZmZmabyd8wNzOz3Jw8\nzMwsNycPMzPLrajkIWnfcgdiZmbtR7FnHtdLmizpAkm9yhqRmZm1eUUlj4g4FDiV7It8UyXdJemz\nZY3MzMzarKLHPCLideAK4BvAp4FrJM2QdFK5gjMzs7ap2DGP/SSNAV4FPgMcGxGD0vSYMsZnZmZt\nULFfErwW+DnwrYhY1lAYEW9LuqIskZmZWZtVbPI4GljWcMsQSZ2AbhHxt4gYX7bozMysTSp2zGMi\n2b2lGvRIZWZmVoOKTR7dImJpw0ya7lGekMzMrK0rNnl8JGlow4ykA4BlLaxvZmYdWLFjHl8D7pX0\nNiBgR+DkskVlZmZtWlHJIyKek7QXsGcqei0iVpUvLDMza8vy/J7HgUBd2maoJCJiXFmiMjOzNq2o\n5CFpPLAHMA1o+IW/AJw8zMxqULFnHvXA4IiIcgZjZmbtQ7FXW71ENkhuZmZW9JnHDsArkiYDKxoK\nI+K4skRlZmZtWrHJY3Q5gzAzs/al2Et1n5K0GzAwIiZK6gF0Lm9oZmbWVhV7S/azgfuAG1PRzsAD\n5QrKzMzatmIHzC8EDgaWwLofhupTrqDMzKxtKzZ5rIiIlQ0zkrqQfc/DzMxqULHJ4ylJ3wK6p98u\nvxf4VfnCMjOztqzY5HE5sBCYDpwLPEL2e+ZmZlaDir3aai1wU/ozM7MaV+y9rd6kiTGOiNi95BGZ\nmVmbl+feVg26Af8IbFf6cMzMrD0oaswjIhYV/M2NiJ8AR5c5NjMza6OK7bYaWjDbiexMJM9vgZiZ\nWQdSbAL4r4Lp1cAs4Islj8bMzNqFYq+2OqzcgZiZWftRbLfVv7a0PCJ+XJpwzMysPchztdWBwENp\n/lhgMvB6OYIyM7O2rdjk0R8YGhEfAkgaDTwcEV8uV2BmZtZ2FXt7kr7AyoL5lanMzMxqULHJYxww\nWdLodNbxLHB7SxtIukXSO5JeKijbTtJjkl5P/7ctWPZNSTMlvSbpyILyAyRNT8uukaRcj9DMzEqu\n2C8JXgl8BXg//X0lIn7Qyma3ASMblV0OPB4RA4HH0zySBgOnAHunba6X1PBLhT8DzgYGpr/GdZqZ\nWYUVe+YB0ANYEhE/BeZIGtDSyhHxO+C9RsXHs/6M5XbghILyuyNiRUS8CcwEhknqB3wsIiZFRJCd\nAZ2AmZlVVbE/Q/sd4BvAN1NRV+COTdhf34iYl6bns37cZGdgdsF6c1LZzmm6cbmZmVVRsWceJwLH\nAR8BRMTbwNabs+N0JlHSXyOUdI6kKZKmLFy4sJRVm5lZgWKTx8rCg72krTZxfwtSVxTp/zupfC6w\nS8F6/VPZ3DTduLxJETE2Iuojor53796bGKKZmbWm2OQxQdKNwDaSzgYmsmk/DPUQcHqaPh14sKD8\nFElbprGUgcDk1MW1RNLwdJXVaQXbmJlZlRR7b6ur02+XLwH2BP49Ih5raRtJvwBGADtImgN8B7iK\nLBGdCbxFurliRLwsaQLwCtmNFy+MiDWpqgvIrtzqDjya/szMrIpaTR7pktmJ6eaILSaMQhExqplF\nhzez/pXAlU2UTwH2KXa/ZmZWfq12W6UzgLWSelUgHjMzaweKvbfVUmC6pMdIV1wBRMTFZYnKzMza\ntGKTxy/Tn5mZWcvJQ9KuEfHXiGjxPlZmZlZbWhvzeKBhQtL/lDkWMzNrJ1pLHoV3sN29nIGYmVn7\n0VryiGamzcyshrU2YL6/pCVkZyDd0zRpPiLiY2WNzszM2qQWk0dEdG5puZmZ1aY8v+dhZmYGOHmY\nmdkmcPIwM7PcnDzMzCw3Jw8zM8vNycPMzHJz8jAzs9ycPMzMLDcnDzMzy83Jw8zMcnPyMDOz3Jw8\nzMwst2J/htbMrPRG9ypBHYs3vw7LzWceZmaWm5OHmZnl5uRhZma5OXmYmVluTh5mZpabk4eZmeXm\n5GFmZrk5eZiZWW5OHmZmlpuTh5mZ5ebkYWZmuTl5mJlZbk4eZmaWm5OHmZnl5uRhZma5+fc8zCrN\nv2FhHYDPPMzMLLeqJA9JsyRNlzRN0pRUtp2kxyS9nv5vW7D+NyXNlPSapCOrEbOZma1XzTOPwyJi\nSETUp/nLgccjYiDweJpH0mDgFGBvYCRwvaTO1QjYzMwybanb6njg9jR9O3BCQfndEbEiIt4EZgLD\nqhCfmZkl1UoeAUyUNFXSOamsb0TMS9Pzgb5pemdgdsG2c1LZRiSdI2mKpCkLFy4sR9xmZkb1rrY6\nJCLmSuoDPCZpRuHCiAhJkbfSiBgLjAWor6/Pvb2ZWdW0s6vwqnLmERFz0/93gPvJuqEWSOoHkP6/\nk1afC+xSsHn/VGZmZlVS8eQhaStJWzdMA58DXgIeAk5Pq50OPJimHwJOkbSlpAHAQGByZaM2M7NC\n1ei26gvcL6lh/3dFxK8lPQdMkHQm8BbwRYCIeFnSBOAVYDVwYUSsqULcZmaWVDx5RMQbwP5NlC8C\nDm9mmyuBK8scmlmr6i5/eLPrmNWtBIGYVVlbulTXzMzaCScPMzPLzcnDzMxyc/IwM7PcnDzMzCw3\nJw8zM8vNycPMzHJz8jAzs9ycPMzMLDcnDzMzy83Jw8zMcnPyMDOz3Jw8zMwsNycPMzPLzcnDzMxy\nc/IwM7PcnDzMzCw3Jw8zM8vNycPMzHJz8jAzs9y6VDsAqxGje5WgjsWbX4eZlYTPPMzMLDcnDzMz\ny83Jw8zMcnPyMDOz3Jw8zMwsNycPMzPLzcnDzMxyc/IwM7Pc/CVBa1Xd5Q9vdh2zupUgEDNrM3zm\nYWZmuTl5mJlZbk4eZmaWm5OHmZnl5gHzcvKdZM2sg/KZh5mZ5ebkYWZmuTl5mJlZbu1mzEPSSOCn\nQGfg5xFxVTn35y/GmZk1r12ceUjqDFwHfB4YDIySNLi6UZmZ1a52kTyAYcDMiHgjIlYCdwPHVzkm\nM7OapYiodgytkvQPwMiIOCvN/xNwUERc1Gi9c4Bz0uyewGsVDXRjOwDvVjmGtsJtsZ7bYj23xXpt\npS12i4jera3UbsY8ihERY4Gx1Y6jgaQpEVFf7TjaArfFem6L9dwW67W3tmgv3VZzgV0K5vunMjMz\nq4L2kjyeAwZKGiBpC+AU4KEqx2RmVrPaRbdVRKyWdBHwG7JLdW+JiJerHFYx2kwXWhvgtljPbbGe\n22K9dtUW7WLA3MzM2pb20m1lZmZtiJOHmZnl5uRhZma5OXmYmVlu7eJqK2ufJPUFdk6zcyNiQTXj\nqSa3RcbtsKH23B6+2qrE2vOLoVQkDQFuAHqx/suc/YEPgAsi4vlqxVZpbouM22FDHaE9nDxKpCO8\nGEpF0jTg3Ih4tlH5cODGiNi/OpFVntsi43bYUEdoD3dblc5tNP9iuBVo8y+GEtqqcTsARMQkSVtV\nI6Aqcltk3A4bavft4eRROu3+xVBCj0p6GBgHzE5luwCnAb+uWlTV4bbIuB021O7bw91WJSLpGmAP\nmn4xvNn49vEdnaTPk/3myrrxH+ChiHikelFVh9si43bYUHtvDyePEmrvLwYzs2I5eVhFSTon/e5K\nzXNbZNwOG2ov7eEvCVZA+oVDy6jaAbQhbouM22FD7aI9PGBeGe3ixVBKkvYi6757NiKWFix6q0oh\nVY2kYUBExHOSBgMjgRkRcWOVQ6sqSeMi4rRabwcASYcAw4CX2kt7OHlUxspqB1BJki4GLgReBW6W\ndElEPJgW/4B2cjVJKUj6DvB5oIukx4CDgN8Cl0v6RERcWdUAK0RS4x9vE3CYpG0AIuK4ykdVPZIm\nR8SwNH022fvlfuA7koZGxFVVDbAIHvOoAEl/jYhdqx1HpUiaDnwyIpZKqgPuA8ZHxE8l/SkiPlHV\nACsotcVt4JhDAAAE4klEQVQQYEtgPtA/IpZI6k52VrZfVQOsEEnPA68APweCLHn8guxXQYmIp6oX\nXeUVvg8kPQccFREL02X9kyJi3+pG2DqfeZSIpBebWwT0rWQsbUCnhq6qiJglaQRwn6TdqL0uvNUR\nsQb4m6S/RMQSgIhYJmltlWOrpHrgEuDfgP8XEdMkLau1pFGgk6RtycadO0fEQoCI+EjS6uqGVhwn\nj9LpCxwJvN+oXMAfKh9OVS2QNCQipgGkM5BjgFuANv+JqsRWSuoREX8DDmgolNQLqJnkERFrgTGS\n7k3/F1Dbx59ewFSy40NI6hcR8yT1pJ18wKrlJ6/U/hfo2XDALCTpycqHU1WnARt8eoqI1cBpktrF\nYGAJ/X1ErIB1B9AGXYHTqxNS9UTEHOAfJR0NLKl2PNUSEXXNLFoLnFjBUDaZxzzMzCw3f8/DzMxy\nc/IwM7PcnDysQ5IUku4omO8iaaGk/y1B3WW9AELSGSnWaZJmSPqXIrYZIelTBfPnSTqtnHFabfOA\nuXVUHwH7SOoeEcuAz7L+R7qKIqlLGujfYD4iPtXSdiVyT0RcJGl74DVJ90XE7BbWHwEsJV3ZFxE3\nVCBGq2E+87CO7BHg6DQ9iuxLaUB2yxBJf5T0J0l/kLRnKj9D0kOSngAeT5/of5++If1KWmdp+t9T\n0uOSnpc0XdLxBfV/W9Jrkp6W9AtJl6XyPST9WtLUVO9eLT2AiFgEzAT6pe2PlfRsinuipL7pi5jn\nAf+SzlYOlTS6YJ9PSvqRpMmS/izp0FTeQ9IESa9Iuj/VW7+ZbW41wmce1pHdDfx76qraj+x7Joem\nZTOAQyNitaQjyG6b8oW0bCiwX0S8l77gOBTYJyLebFT/cuDE9I3xHYBJKcnUp7r2J7sk93mya/oB\nxgLnRcTrkg4Crgc+09wDkLQr0A1o+BLq08DwiAhJZwFfj4hLJd0ALI2Iq9N2hzeqqktEDJN0FPAd\n4AjgAuD9iBgsaR9go8vMzZrj5GEdVkS8mD6VjyI7CynUC7hd0kCy22V0LVj2WES8VzA/uYnEAdmX\nuX4g6e/Jrs/fmezLogcDD0bEcmC5pF9BdqYCfAq4V1r3PbAtmwn/5FTvXsBFqS6A/sA9kvoBWwBN\nxdWUX6b/U4G6NH0I8FOAiHiphbskmG3E3VbW0T0EXE1Bl1XyPeC3EbEPcCzZp/sGHzVat/F8g1OB\n3sABETEEWNConsY6AR9ExJCCv0HNrHtPuu/Vp4CrJO2Yyq8F/jvd++jcVvZXaEX6vwZ/aLQScPKw\nju4W4LsRMb1ReS/WD6CfsYl19wLeiYhVkg4DdkvlzwDHSuqWzjaOAUj3tXpT0j8CKLN/SzuIiCnA\neLL7QjWOu/Ab6h8CW+eM/xngiymWwdTerWNsMzh5WIcWEXMi4pomFv0n8ENJf2LTP4nfCdSnO+ee\nRjaOQkQ8R3bG8yLwKDAdWJy2ORU4U9ILwMtkP1vcmh8BX5G0NTCarNtrKvBuwTq/Ak5sGDAvMv7r\ngd6SXgG+n+JZ3PImZhnfnsSsDCT1TDeE7AH8DjgnIp6vdlyFJHUGukbEckl7ABOBPSOipn5/xjaN\n+z7NymNs6grqBtze1hJH0gP4raSuZIP/FzhxWLF85mFmZrl5zMPMzHJz8jAzs9ycPMzMLDcnDzMz\ny83Jw8zMcvv/Urlwcntmc2oAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0xbdd8e48>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# barplot of marriage rating grouped by affair (True or False)\n",
    "pd.crosstab(dta.rate_marriage, dta.affair.astype(bool)).plot(kind='bar')\n",
    "plt.title('Marriage Rating Distribution by Affair Status')\n",
    "plt.xlabel('Marriage Rating')\n",
    "plt.ylabel('Frequency')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's use a stacked barplot to look at the percentage of women having affairs by number of years of marriage."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0xbf97128>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEjCAYAAADdZh27AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xm8FnXd//HXm03AXXBLVpNUvN0SUdvU1IQy8TYTs3JF\nxL1uf7/k9lbT0tLyLi1JXDJzRaXMjbS0rLBQMBFFUhFRDrnghisI8rn/mO8ZLg5nuc7hzLnOdc77\n+Xhcj8fMd2a+12euA/OZ+X5nvqOIwMzMDKBLpQMwM7P2w0nBzMxyTgpmZpZzUjAzs5yTgpmZ5ZwU\nzMws56TQgUjqJekuSYsl3ZbKzpf0mqSXy9h+oqSzi4+0c5C0l6SaSsfRUUg6UtLvW7jtGEkPtnJI\nHZKTQhWS9KCkNyWtVWfRIcCmQJ+I+KqkAcDpwNCI2KypeiNiXER8vxlxHCXpI0nvSnpb0kxJBzRr\nZwqWYpxa6TjWRDqgzSn9e0vqI+lVSSMqGNdWkkLS9Drlm0paJmlua35fRPw6Ika2Zp22OieFKiNp\nEPBZIIAD6yweCDwTEcvT/ADg9Yh4tRW+t1sDi/4REesAGwC/BG6VtGEr1W1ARFwNLATOKSm+BJgS\nEfe25ne18G+xnqRtS+a/DsxrzRj8b6TtOClUnyOAacC1wJG1hZLOIztojE5n7scDfwQ+luavTevd\nJunl1MT0V0nbldRxraTz0/RekmoknZGann7VWFARsQK4BugFfDzVcUC6enhL0t8l7VDyXfNT3bOA\n9yR1k9Rf0m8lLZL0uqTLStY/Jp0tvynpPkkDS5aFpHGSnk3fNUGZbYGJwB7pN3grrf8lSY+lq5sF\nks4t3RdJR0h6IcVwdop137Ssi6Txkp5Ly2+VtFFjv42kM1MT3nxJX09lu0p6RVLXkvUOlvR4A9WM\nAU6UtJOk/YF9gG+XbHugpMfT/k+V9B8ly86SNE/SO5JmSzqwZNmY9O/gZ5LeAM6S9IlUtjjFfVNj\n+wdcT/bvstYRwHV1foPmxtBQ2YMl2w2VdL+kNyT9S9JXSpZtLOnu9DeeBgxuYh+sVkT4U0UfYC5w\nIrALsAzYtGTZucANJfN7ATV1tj8GWBdYi+xsc2bJsmuB80u2XQ5clNbtVU8sRwFT03Q34DTgHWB9\nYGfgVWA3oCtZApsPrJXWnw/MBPqTJZKuwOPAT4G1gZ7AZ9K6o9J+b5u+5yzg7yVxBHA32dXKAGAR\nMKJujHV+l+3JTop2AF4BDkrLhgLvAp8BegAXp99537T8NLKk3C/9LlcANzfwt6r9DX+S1t0TeA/Y\nOi1/ChhZsv7twOmN/O1PAf4JPF8bbyrfNe3Drul3PAZ4DuiRlh8KbJ729/C0f5umZWNSjCekbXsB\ntwFnpPV7Ap9uIJ6t0m8/EHghrb898CQwAphbsm5zY2io7MG0zTpkV09HpH8TuwCvl/y2k4Gbgd7p\nb/xS7bb+NHGMqXQA/jTjj5UdqJYBfdP8v4Bvlyw/lyaSQp36Nkj/qddP89eyalL4EOjZyPZHpf+4\nbwGvkR0saw+elwPfr7P+08CeaXo+cEzJsj3IDubd6vme3wPHlsx3Ad4HBqb5ICWQNH8rML4kxqkN\n7UNa5xLgp2n6HEoO8umg8mHJfs0B9ilZvnn6m9QX917p91m7Tmxnp+kzgBvT9EZpnzZvJE4BDwO3\n1ym/CvhunbLnaPhg/iTwpTQ9BphXZ/lN6e+3RRO/21ZApOkHya5eLk77tUpSaEEMDZU9mKa/Dvy5\nzvJfAv8DdE+/+1Yly36Ek0JZHzcfVZcjgT9ExGtp/iZKmpCaIqmrpAtT08fbZAdmgL4NbLIoIpY0\nUe20iNggIvpGxO4RcX8qHwicnpoz3kpNN/2Bj5Vsu6Bkuj/wQqzsDyk1ELi0pJ43yA6QW5SsU3p3\n1ftkZ5L1krSbpD+nZqrFwDhW/gYfK40rIt4nOwMtjeX2kljmAB+RdfDX582IeK9k/gVW/gY3AF+W\ntDbZmfTfIuKlhuJOR985wOw6iwYCZ9T5rTcn/T7KOtsfL1m2Dav+zRfUqe90sgPrDElPSCrn39h1\nwNHAYWm/VtGCGBoqqzUQ+HSdfR5Ntt+bkl1dlG7/Qhn7YGSXXVYFJPUiO3B01crbS9cCNpC0Y0Q0\n1BZd6nCypph9yRLC+sCbZAfY+qzJELoLgAsi4oJG1imtfwEwQFK3ehJDbV03tiCO+vbhJuAysqab\nJZIuYeUB6iVg69oV0+/ep04sx0TEQ2V+/4aS1i5JDAPIzpKJiIWS/gEcDHyT7Oy8JRYA50XERXUX\nSNoy1bsP8HBEfCTpSVb9m6/yG6XENCZt/zngj5L+GhHPNxLDbcDPyG48WChp+zWJoZGyWguAB6Ke\nu5EkdQdWkJ1o1N4BNaCRuqyErxSqx0FkZ6RDgZ3SZ1vgb6zaydeYdYGlZGe+vYEftH6YuauAcems\nXJLWTh286zaw/iNkB+QL07o9JX06LZsI/LdSp7ik9SV9tcw4XgH6SepRUrYu8EZKCMPJkmWtyWRn\n759K25zLqgevicAFSh3dqUNzVBMxnCeph6TPAgeQHUBrXQd8h6wt/rdl7lNdVwEnpc5rSVpHUu0V\nyDpkB9dFWbg6juwsvUGSDpVUexX2Vtr+o8a2iYh3gL2B4+tZ3OwYynAnsJ2kwyV1T5/hkraOiGXA\n78h+917KOt2/uYbf12k4KVSPI4FfRcSLEfFy7YfsjPfrKu+WvevILqMXknVyTisq2IiYARyX4nuT\n7IztqEbW/wj4Mlk79YtADVlzABFxO1mH96TU7PUkUO796n8ia255WVJts9uJwPckvUPWh3BrSRyz\nyTp0J5ElqXfJOsyXplUuJTsg/SFtP42sM70hL5Pt/7+BG4FxEfGvkuW3k5qkUlNVs0XENLIO2cvT\ndz0DfCMtmwX8nJVJd2uyfonG7AZMl/QeWaI6KSJeLCOO6RGx2q2oLYyhqe9aDOxPtp8vkf3OPyS7\neobs99iQ7KTglzRx95ytpNQJY2b1kLQO2dnykCaaT9bkO54Dji/pjzGrGF8pmNWRml56p+aXi4En\nWNkp39rf9RWyppU/FVG/WXO5o9lsdaPIHsgSMAM4LAq4pE4PYg0FvhnZw39mFefmIzMzy7n5yMzM\nck4KZmaWq7o+hb59+8agQYMqHYaZWVV59NFHX4uIjZtar+qSwqBBg5gxY0alwzAzqyqSyhrqw81H\nZmaWc1IwM7Ock4KZmeWcFMzMLOekYGZmucKSgqRrJL2axk2vb7nS+1fnSpol6ZNFxWJmZuUp8krh\nWrJX8jVkJDAkfcbS8heMmJlZKyksKUTEX8lem9iQUcB1kZlG9gaxzYuKx8zMmlbJh9e2YNV3qNak\nstXeUStpLNnVBAMGNPOteueu3+IAy6t/ccH1O/7G63f8DdddxbGD42+y/mLir4qO5oi4MiKGRcSw\njTdu8iltMzNroUomhYVkL9au1S+VmZlZhVQyKdwJHJHuQtodWBwRqzUdmZlZ2ymsT0HSzcBeQF9J\nNcB3ge4AETERmAJ8keyF7u8DRxcVi5mZlaewpBARX2tieQAnFfX9ZmbWfFXR0WxmZm3DScHMzHJO\nCmZmlqu6N68116AlNxVa//xCazcza1u+UjAzs5yTgpmZ5ZwUzMws56RgZmY5JwUzM8s5KZiZWc5J\nwczMck4KZmaWc1IwM7Ock4KZmeWcFMzMLOekYGZmOScFMzPLOSmYmVnOScHMzHJOCmZmlnNSMDOz\nnJOCmZnlnBTMzCznpGBmZjknBTMzyzkpmJlZzknBzMxyTgpmZpZzUjAzs5yTgpmZ5ZwUzMws56Rg\nZma5QpOCpBGSnpY0V9L4epavL+kuSY9Lmi3p6CLjMTOzxhWWFCR1BSYAI4GhwNckDa2z2knAUxGx\nI7AX8L+SehQVk5mZNa7IK4XhwNyImBcRHwKTgFF11glgXUkC1gHeAJYXGJOZmTWiyKSwBbCgZL4m\nlZW6DNgW+DfwBHBaRKwoMCYzM2tEpTua9wdmAh8DdgIuk7Re3ZUkjZU0Q9KMRYsWtXWMZmadRpFJ\nYSHQv2S+XyordTTw28jMBZ4HtqlbUURcGRHDImLYxhtvXFjAZmadXZFJYTowRNLg1Hl8GHBnnXVe\nBPYBkLQpsDUwr8CYzMysEd2Kqjgilks6GbgP6ApcExGzJY1LyycC3weulfQEIOCMiHitqJiq0aAl\nNxVa//xCazezalNYUgCIiCnAlDplE0um/w18ocgYzMysfJXuaDYzs3bEScHMzHJOCmZmlnNSMDOz\nnJOCmZnlnBTMzCznpGBmZjknBTMzyzkpmJlZzknBzMxyTgpmZpZzUjAzs5yTgpmZ5ZwUzMws56Rg\nZmY5JwUzM8s5KZiZWc5JwczMck4KZmaWc1IwM7Ock4KZmeWcFMzMLOekYGZmOScFMzPLdat0ANax\nDVpyU6H1zy+0drPOp+wrBUm9JG1dZDBmZlZZZSUFSV8GZgL3pvmdJN1ZZGBmZtb2yr1SOBcYDrwF\nEBEzgcEFxWRmZhVSblJYFhGL65RFawdjZmaVVW5H82xJhwNdJQ0BTgX+XlxYZmZWCeVeKZwCbAcs\nBW4G3ga+VVRQZmZWGWVdKUTE+8D/pI+ZmXVQZSUFSXexeh/CYmAGcEVELGlguxHApUBX4OqIuLCe\ndfYCLgG6A69FxJ5lR29mZq2q3OajecC7wFXp8zbwDvCJNL8aSV2BCcBIYCjwNUlD66yzAfAL4MCI\n2A74agv2wczMWkm5Hc2fiohdS+bvkjQ9InaVNLuBbYYDcyNiHoCkScAo4KmSdQ4HfhsRLwJExKvN\nC9/MzFpTuVcK60gaUDuTptdJsx82sM0WwIKS+ZpUVuoTwIaSHpT0qKQjyozHzMwKUO6VwunAVEnP\nASJ7cO1ESWsDv17D798F2AfoBfxD0rSIeKZ0JUljgbEAAwYMWK0SMzNrHeXefTQlPZ+wTSp6uqRz\n+ZIGNlsI9C+Z75fKStUAr0fEe8B7kv4K7AiskhQi4krgSoBhw4b5oTkzs4I0Z+jsIcDWZAftQ8to\n6pkODJE0WFIP4DCg7nhJdwCfkdRNUm9gN2BOM2IyM7NWVO4tqd8F9iK7i2gK2R1FU4HrGtomIpZL\nOhm4j+yW1GsiYrakcWn5xIiYI+leYBawguy21SfXYH/MLPGw5dYS5fYpHEJ2hfBYRBwtaVPghqY2\niogpZEmktGxinfkfAz8uMw4zMytQuUnhg4hYIWm5pPWAV1m1v8CsQ/LZtrVUtf7bKTcpzEgPml0F\nPEr2INs/CorJzMwqpNy7j05MkxNTH8B6ETGruLDMzKwSyn3z2gO10xExPyJmlZaZmVnH0OiVgqSe\nQG+gr6QNyR5cA1iP1Z9ONjOzKtdU89HxZO9N+BhZX0JtUngbuKzAuMzMrAIaTQoRcSlwqaRTIuLn\nbRSTmVnV3r1T7crtaP65pE8Bg0q3iYgGH14zM7PqU+4TzdcDHwdmAh+l4qCRJ5rNzKz6lPucwjBg\naER4MDozsw6s3AHxngQ2KzIQMzOrvHKvFPoCT0l6BFhaWxgRBxYSlZmZVUS5SeHcIoMwM7P2ody7\nj/4iaSAwJCLuT+8+6FpsaGZm1tbKHebiOGAycEUq2gL4XVFBmZlZZZTb0XwS8GmyJ5mJiGeBTYoK\nyszMKqPcpLA0Ij6snZHUjew5BTMz60DKTQp/kXQm0EvSfsBtwF3FhWVmZpVQblIYDywCniAbJG8K\ncFZRQZmZWWWUe0tqL+CaiLgKQFLXVPZ+UYGZmVnbK/dK4QGyJFCrF3B/64djZmaVVG5S6BkR79bO\npOnexYRkZmaVUm5SeE/SJ2tnJO0CfFBMSGZmVinl9imcBtwm6d9kb1/bDBhdWFRmZlYRTSYFSV2A\nHsA2wNap+OmIWFZkYGZm1vaaTAoRsULShIjYmWwIbTMz66DKvvtI0lckqdBozMysospNCseTPcX8\noaS3Jb0j6e0C4zIzswood+jsdYsOxMzMKq/cobMl6RuSzk7z/SUNLzY0MzNra+U2H/0C2AM4PM2/\nC0woJCIzM6uYcp9T2C0iPinpMYCIeFNSjwLjMjOzCij3SmFZGgQvACRtDKwoLCozM6uIcpPCz4Db\ngU0kXQBMBX7Q1EaSRkh6WtJcSeMbWW9XScslHVJmPGZmVoBy7z66UdKjwD5kw1wcFBFzGtsmXVlM\nAPYDaoDpku6MiKfqWe8i4A8tiN/MzFpRo0lBUk9gHLAV2Qt2roiI5WXWPRyYGxHzUl2TgFHAU3XW\nOwX4DbBrM+I2M7MCNNV89GtgGFlCGAlc3Iy6twAWlMzXpLKcpC2A/wQub0a9ZmZWkKaaj4ZGxPYA\nkn4JPNLK338JcEYaX6nBlSSNBcYCDBgwoJVDMDOzWk0lhXwk1IhY3syhjxYC/Uvm+6WyUsOASane\nvsAXJS2PiN+VrhQRVwJXAgwbNiyaE4SZmZWvqaSwY8kYRwJ6pXkBERHrNbLtdGCIpMFkyeAwVj78\nBlkFg2unJV0L3F03IZiZWdtpNClERNeWVpyuLE4G7gO6AtdExGxJ49LyiS2t28zMilHuE80tEhFT\ngCl1yupNBhFxVJGxmJlZ08p9eM3MzDoBJwUzM8s5KZiZWc5JwczMck4KZmaWc1IwM7Ock4KZmeWc\nFMzMLOekYGZmOScFMzPLOSmYmVnOScHMzHJOCmZmlnNSMDOznJOCmZnlnBTMzCznpGBmZjknBTMz\nyzkpmJlZzknBzMxyTgpmZpZzUjAzs5yTgpmZ5ZwUzMws163SAZi1tmXLllFTU8OSJUvWuK6rDty8\nFSJq2Jw5c1qlnp49e9KvXz+6d+/eKvVZ5+WkYB1OTU0N6667LoMGDULSGtW1rOatVoqqftv222CN\n64gIXn/9dWpqahg8eHArRGWdmZuPrMNZsmQJffr0WeOEUC0k0adPn1a5MjJzUrAOqbMkhFqdbX+t\nOE4KZmaWc1Iwa6Y/3P07Dtp7N4499MsAnHHSsRyy36e5/qpfNLjNrddfw12TJ7VViGYt5o5ms2a6\nfdINnHPRJXxy+B689uorzH78Me6e+s9Gtzn0m8fUW758+XK6dfN/Q2s//K/RrBHfOvbrvPzSQpYu\nXcrXjzme1xa9ymPTp3Hu/z+VvfYbwd//8ideffklDt3/s4z/3kU8/9yz/ObGX7Ns2Yf0H7QlF1w6\nkV69enP5Ty6kd++1OXLcKRz71QPYervteeyRaRxz5Dc4/fTTK72bZrlCk4KkEcClQFfg6oi4sM7y\nrwNnAALeAU6IiMeLjMmsOc67+DLW33BDlnzwAYcf8HmumXwP0x/6K/911vfZbsedGX3kcZxy1Ghu\nve9vAGz5iW34yuFHAnDZj87n9kk3cPjRY1erd9mHH3LzlD+zQyvckmrWmgpLCpK6AhOA/YAaYLqk\nOyPiqZLVngf2jIg3JY0ErgR2Kyoms+a66VdX8Kd77wbglZcW8uLzzzW6/tx/zeGyH5/PO28v5v33\n3+NTe36+3vX2//LBrR6rWWso8kphODA3IuYBSJoEjALypBARfy9ZfxrQr8B4zJrlwQcfZNrUB7nu\njj/Qq1dvjv3qASxdurTRbc4+/UQuufoGth66PXfcehMz/jG13vV69e5dRMhma6zIu4+2ABaUzNek\nsoYcC/y+wHjMmmXx4sWst/4G9OrVm+fnPsOsx2Y0uc37775L3002Y9myZUz53W1tEKVZ62oXHc2S\n9iZLCp9pYPlYYCzAgAED2jAy68xGjBjBjy/5OQftvRuDttyKHXYe1uQ2J/2/M/nGgfuy4UZ92X7n\nXXj/3XfbIFKz1lNkUlgI9C+Z75fKViFpB+BqYGREvF5fRRFxJVl/A8OGDYvWD9VsdWuttRa/uH7y\nauW73nZ3Pr1F/wH89oF/5POHHnEshx5x7GrbnPBf4/PpX5Zsb9beFNl8NB0YImmwpB7AYcCdpStI\nGgD8FvhmRDxTYCxmZlaGwq4UImK5pJOB+8huSb0mImZLGpeWTwTOAfoAv0hjtyyPiKav0c3MrBCF\n9ilExBRgSp2yiSXTY4AxRcZgZmbl89hHZmaWc1IwM7Ock4KZmeXaxXMKZkUaNP6eVq3vzpM/3eQ6\nOw/sw5BthubzP736RrboX/8zNvPnz+eAAw7gySefbLUYzVrKScGsAGv17JUPkmdWTdx8ZNZGFi54\nkaMOHsnokXsyeuSezJzx8GrrzJ49m+HDh7PTTjuxww478OyzzwJwww035OXHH388H330UVuHb52E\nk4JZAZYu+YBD9/8sh+7/Wb415hsAbNS3L1fcdDu3/P4v/OgX13DROeNX227ixImcdtppzJw5kxkz\nZtCvXz/mzJnDLbfcwkMPPcTMmTPp2rUrN954Y1vvknUSbj4yK0B9zUfLly3jh2d/h6dnP0HXrl15\nYd7qw3DvscceXHDBBdTU1HDwwQczZMgQHnjgAR599FF23XVXAD744AM22WSTNtkP63ycFMzayA1X\nX06fvptw2x+msmLFCoZvtdlq6xx++OHstttu3HPPPXzxi1/kiiuuICI48sgj+eEPf1iBqK2zcfOR\nWRt59+236bvJpnTp0oW7f3NLvf0C8+bNY8stt+TUU09l1KhRzJo1i3322YfJkyfz6quvAvDGG2/w\nwgsvtHX41kn4SsE6vPkXfqnF286qeavV4jj0yGM5fewR3P2bSXxqr33o1Xvt1da59dZbuf766+ne\nvTubbbYZZ555JhtttBHnn38+X/jCF1ixYgXdu3dnwoQJDBw4sNViM6vlpGBWgGlP16xWNnDwx5n8\nx4fy+W+feR4AgwYNyp9RGD9+POPHr94BPXr0aEaPHl1QtGYrufnIzMxyTgpmZpZzUjAzs5yTgpmZ\n5ZwUzMws56RgZmY535JqHd+567d40x3qKZs1puEHx9568w3GHjYKgNcWvUqXLl3ZqE8fAG686wG6\n9+jR4ljM2oKTglkr2mDDjfIxjy7/yYX07r02R447ZZV1IoKIoEsXX6hb++N/lWZt4MXn5/Gfn9+d\n/z7lOA7eZw9e/ncNn9lu5RPJkyZNYsyYMQC88sorHHzwwQwbNozhw4czbdq0SoVtnZCvFMzayPNz\nn+H8n17OdjvuzPLlyxtc79RTT+U73/kOu+++u9/KZm3OScGsjfQfOJjtdty5yfXuv/9+nn766Xz+\nzTff5IMPPqBXr15FhmcGOCmYtZlevXvn0126dCEi8vklS5bk0xHBI488Qg93SlsFuE/BrAK6dOnC\neutvwLPPPsuKFSu4/fbb82X77rsvEyZMyOdnzpxZiRCtk/KVgnV85y5u8aatOXR2Xaf997nsv//+\nbLLJJuyyyy4sXboUgAkTJnDCCSfwq1/9iuXLl7P33nuvkiTMiuSkYFaQE/5r5RDYAwZvudrrOUcc\neDDfOfGY1bbbeOONmTx5cuHxmdXHzUdmZpZzUjAzs5yTgnVIpXf2dAadbX+tOE4K1uH07NmT119/\nvdMcKCOC119/nZ49e1Y6FOsA3NFsHU6/fv2oqalh0aJFa1zXK29+0AoRNWzOO63zQFrPnj3p169f\nq9RlnZuTgnU43bt3Z/Dgwa1S18jx97RKPQ2Zf+GXCq3frLkKbT6SNELS05LmShpfz3JJ+llaPkvS\nJ4uMx8zMGldYUpDUFZgAjASGAl+TNLTOaiOBIekzFri8qHjMzKxpRV4pDAfmRsS8iPgQmASMqrPO\nKOC6yEwDNpC0eYExmZlZI1TUHRqSDgFGRMSYNP9NYLeIOLlknbuBCyNiapp/ADgjImbUqWss2ZUE\nwNbA0xSnL/BagfUXzfFXVjXHX82xg+NvysCI2LiplaqiozkirgSubIvvkjQjIoa1xXcVwfFXVjXH\nX82xg+NvLUU2Hy0E+pfM90tlzV3HzMzaSJFJYTowRNJgST2Aw4A766xzJ3BEugtpd2BxRLxUYExm\nZtaIwpqPImK5pJOB+4CuwDURMVvSuLR8IjAF+CIwF3gfOLqoeJqhTZqpCuT4K6ua46/m2MHxt4rC\nOprNzKz6eOwjMzPLOSmYmVnOScHMzHJOCmZmlquKh9fakqSNIuKNSsfRUtUYv6RNgS3S7MKIeKWS\n8TRHNcdulSVpG7KhfvJ/P8CdETGnclF18ruPJJ0VEeen6aHA74DugIDREfFwJeNrSgeIfydgIrA+\nKx9a7Ae8BZwYEf+sVGxNqebYS7XXA1NzSfoM2XhrT0bEHyodT1MknQF8jWxMuJpU3I/sea5JEXFh\npWIjIjrtB/hnyfQ9wMg0PRz4e6Xj6wTxzyQbD6tu+e7A45WOr6PGXhLrGWk/xgPfSJ/xtWWVjq+J\n2B8pmT4uxfxd4KH2HnuK+Rmgez3lPYBnKxmbm49W2iIifg8QEY9Iap1XYrWdaox/7ajnaiYipkla\nuxIBNUM1x17rWGC7iFhWWijpJ8BsoHJnq03rXjI9FtgvIhZJuhiYRvuOHWAF8DHghTrlm6dlFdPZ\nk8KWku4ka27pJ6l3RLyflnVvZLv2otrj/72ke4DrgAWprD9wBHBvxaIqTzXHXqvdHpjK0EXShmQ3\ny3SNiEUAEfGepOWVDa0s3wIekPQsK//9DAC2Ak5ucKs20NmTQt33O3SBvPOwGl74U9XxR8Spkkay\nepv2hIiYUrnImlbNsZdotwemMqwPPEp2QhSSNo+IlyStk8ratYi4V9InyJp6S//9TI+IjyoXWSfv\naDbr7CR1oR0emFpKUm9g04h4vtKxVCs/p9CA9GKfquX4K6eaYo+IFRExLSJ+kz7TqjUhAETE+9We\nENLLxyrGSaFh7f4StAmOv3KqOXag8gemNVHNsSfHVfLLO33zUbXfp53i3wJ4OCLeLSkfERHtusNT\n0m7AnIh4O90tNR74JPAU8IOIWFzRAJsgaUvgYLIO5o/IbjO8KSLermhgraC2jb7ScbRENcfeHnTq\nK4X0AMkksjO7R9JHwM2SxlcytnJIOhW4AzgFeFJSacfzDyoTVbNcQ/YeDYBLyToPL0plv6pUUOVI\nv/1EoCewK7AWWXKYJmmvCobWKqrxoCqpD1RH7JI2k3S5pAmS+kg6V9ITkm6VtHlFY+vMVwqSnqH+\n+7R7ALNox5XSAAAF5ElEQVQjYkhlIiuPpCeAPSLiXUmDgMnA9RFxqaTHImLnigbYBElzImLbNP3P\niPhkybKZEbFT5aJrXPrtd4qIj1Ln5pSI2EvSAOCO9v7bQ3ZgInvgawVwDtnJxVeAOcBp7fngKulC\n4OKIeE3SMOBWsv3oDhwREX+paIBNkHQv2QOnawOHAzcCNwEHAftGRN07C9tMp75SYOV92nVVw33a\nAF1qm4wiYj6wFzAyPXxUDe3aT0qqfdve4+k/N+lWvWUNb9Zu1N7SvRawDkBEvEh1PCMCcC1ZU90C\n4M/AB2RvQvwb2VVQe/aliHgtTf+YbFiXrYD9gP+tXFhl2zQifh7ZcBYbRMRFEbEgIn4ODKxkYJ39\nOYVqvk8b4BVJO0XETIB0xXAAWbPM9pUNrSxjgEslnQW8BvxD0gKyv8WYikbWtKuB6ZIeBj5L1uyF\npI2BahmQcNN0EELSiRFxUSr/uaRjKxhXObpJ6hYRy4FeETEdICKekbRWhWMrR+kJ+XV1lnVty0Dq\n6tTNR1Dd92lL6gcsj4iX61n26Yh4qAJhNZuk9YDBZCcpNVElI41K2g7YlmwQtn9VOp7mkvR4ROyY\nps+PiLNKlj0REe32xELSKcCXyYaz+BywIfBb4PPAlhHxzQqG1yRJ3wN+VHpzSCrfCrgwIg6pTGRO\nCmadVns+MJUjdeifAHyC7IRiAdlIwdekK4h2rb3eOeikYGarkXR0RLTrO8AaUg2xpyudk8k69Xci\n69i/Iy1b5aaLNo/NScHM6pL0YkQMqHQcLVENsbfnOwc7e0ezWaclaVZDi4BN2zKW5qrm2JNV7hxM\nTWGTJQ2kwncOOimYdV6bAvsDb9YpF/D3tg+nWao5dmjHdw46KZh1XncD69QemEpJerDtw2mWao4d\nsvdurNIZnjrHj5B0RWVCyrhPwczMcp39iWYzMyvhpGBmZjknBat6ykxNr8esLftqGnSs6O/eV1JI\nOqqkbFgq+9Ya1r2bpJ82c5upktrtQILW/jkpWNWLrGNsHPATST3Te3p/AJy0JvVKKvdGjCeA0SXz\nXwMeX5PvSuP6PBwR325OPWZryncfWYcQEU9Kugs4g2w44usi4jlJR5Ilhx5ktyqeHBErJF1J9kKf\nXsAtEfE9AEk1wA1ktzv+II0vdRzZnSKzIuIb9Xz9PGBjSX3JBsPbD/h97UJJ44BjUwzPkA3t/IGk\nG4B3gF2AByV9SDYg48eB5yVdm+I9KCW6y4ChZKOwnhMRd6Vhu38N/AfZiKc91/jHtE7NScE6kvOA\nfwIfAsMk/Qfwn8CnImJ5SgSHkY1bPz4i3khn6H+WNDkinkr1vFr7RKmkl4CBEfGhpA0a+e7fAIeQ\nDVvwMKsO/X1bRExM9V0IHAVcnpZtDuyeEtX5wDbA5yJiiaR9S+o4B7g3Io6StCHwsKQ/kg2V8GZE\nbCtpZ2BGc380s1JOCtZhRMR7km4B3o2IpemguiswQxJkVwW1Q6R/LQ0P3Y3snRpDyc60AW4pqXY2\ncIOkO8gGW2vILcD1ZFcCN5ON1llrhzT43AbAumT32Ne6LSJK391xR0Qsqaf+L5C9K6P2jYA9ya4q\nPgf8KO3/Y5JmNxKjWZOcFKyjWcHKFySJbMTMs0tXkDQEOA0YHhFvpWac0maX90qm9wf2BA4EzpS0\nQ33DqkfEQmWZZ0/gRFZNCtcBI1MT1xhg9wa+q775PGzgoIh4rs6+NLC6Wcu4o9k6svuBQ1NbP8re\nhTsAWI+sLf9tZe/D3b++jSV1BfpFxJ+A7wB9gd6NfN/ZwBl1zvwh6+N4WVJ3slcvtsR9ZK/LrI2t\ndsC0v9bWKWlHYLsW1m8G+ErBOrCIeELSecD96WVKy8juUppB1lT0L+AFoKGXEXUDbpK0LtkJ1MUR\n8U4j3ze1gUXnANOBRcAjtKwz+DzgkjS6ZhdgLjCKrPP515LmkDV1PdaCus1yHubCzMxybj4yM7Oc\nk4KZmeWcFMzMLOekYGZmOScFMzPLOSmYmVnOScHMzHJOCmZmlvs/l1T6VmDvSecAAAAASUVORK5C\nYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0xc00ceb8>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "affair_yrs_married = pd.crosstab(dta.yrs_married, dta.affair.astype(bool))\n",
    "affair_yrs_married.div(affair_yrs_married.sum(1).astype(float), axis=0).plot(kind='bar', stacked=True)\n",
    "plt.title('Affair Percentage by Years Married')\n",
    "plt.xlabel('Years Married')\n",
    "plt.ylabel('Percentage')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prepare Data for Logistic Regression\n",
    "\n",
    "To prepare the data, I want to add an intercept column as well as dummy variables for `occupation` and `occupation_husb`, since I'm treating them as categorial variables. The dmatrices function from the [patsy module](http://patsy.readthedocs.org/en/latest/) can do that using formula language."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Index(['Intercept', 'C(occupation)[T.2.0]', 'C(occupation)[T.3.0]',\n",
      "       'C(occupation)[T.4.0]', 'C(occupation)[T.5.0]', 'C(occupation)[T.6.0]',\n",
      "       'C(occupation_husb)[T.2.0]', 'C(occupation_husb)[T.3.0]',\n",
      "       'C(occupation_husb)[T.4.0]', 'C(occupation_husb)[T.5.0]',\n",
      "       'C(occupation_husb)[T.6.0]', 'rate_marriage', 'age', 'yrs_married',\n",
      "       'children', 'religious', 'educ'],\n",
      "      dtype='object')\n"
     ]
    }
   ],
   "source": [
    "# create dataframes with an intercept column and dummy variables for\n",
    "# occupation and occupation_husb\n",
    "y, X = dmatrices('affair ~ rate_marriage + age + yrs_married + children + \\\n",
    "                  religious + educ + C(occupation) + C(occupation_husb)',\n",
    "                  dta, return_type=\"dataframe\")\n",
    "print(X.columns)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The column names for the dummy variables are ugly, so let's rename those."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# fix column names of X\n",
    "X = X.rename(columns = {'C(occupation)[T.2.0]':'occ_2',\n",
    "                        'C(occupation)[T.3.0]':'occ_3',\n",
    "                        'C(occupation)[T.4.0]':'occ_4',\n",
    "                        'C(occupation)[T.5.0]':'occ_5',\n",
    "                        'C(occupation)[T.6.0]':'occ_6',\n",
    "                        'C(occupation_husb)[T.2.0]':'occ_husb_2',\n",
    "                        'C(occupation_husb)[T.3.0]':'occ_husb_3',\n",
    "                        'C(occupation_husb)[T.4.0]':'occ_husb_4',\n",
    "                        'C(occupation_husb)[T.5.0]':'occ_husb_5',\n",
    "                        'C(occupation_husb)[T.6.0]':'occ_husb_6'})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also need to flatten `y` into a 1-D array, so that scikit-learn will properly understand it as the response variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# flatten y into a 1-D array\n",
    "y = np.ravel(y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Logistic Regression\n",
    "\n",
    "Let's go ahead and run logistic regression on the entire data set, and see how accurate it is!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.72588752748978946"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# instantiate a logistic regression model, and fit with X and y\n",
    "model = LogisticRegression()\n",
    "model = model.fit(X, y)\n",
    "\n",
    "# check the accuracy on the training set\n",
    "model.score(X, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "73% accuracy seems good, but what's the null error rate?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.32249450204209867"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# what percentage had affairs?\n",
    "y.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Only 32% of the women had affairs, which means that you could obtain 68% accuracy by always predicting \"no\". So we're doing better than the null error rate, but not by much.\n",
    "\n",
    "Let's examine the coefficients to see what we learn."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Intercept</td>\n",
       "      <td>[1.48983589132]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>occ_2</td>\n",
       "      <td>[0.188066390244]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>occ_3</td>\n",
       "      <td>[0.498947866816]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>occ_4</td>\n",
       "      <td>[0.250668564985]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>occ_5</td>\n",
       "      <td>[0.839008064812]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>occ_6</td>\n",
       "      <td>[0.833908433744]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>occ_husb_2</td>\n",
       "      <td>[0.190635944587]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>occ_husb_3</td>\n",
       "      <td>[0.297832712926]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>occ_husb_4</td>\n",
       "      <td>[0.161408854076]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>occ_husb_5</td>\n",
       "      <td>[0.18777091389]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>occ_husb_6</td>\n",
       "      <td>[0.194016372255]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>rate_marriage</td>\n",
       "      <td>[-0.703123359732]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>age</td>\n",
       "      <td>[-0.0584177744817]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>yrs_married</td>\n",
       "      <td>[0.105676537997]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>children</td>\n",
       "      <td>[0.0169192669709]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>religious</td>\n",
       "      <td>[-0.371136265314]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>educ</td>\n",
       "      <td>[0.00401650319564]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                0                   1\n",
       "0       Intercept     [1.48983589132]\n",
       "1           occ_2    [0.188066390244]\n",
       "2           occ_3    [0.498947866816]\n",
       "3           occ_4    [0.250668564985]\n",
       "4           occ_5    [0.839008064812]\n",
       "5           occ_6    [0.833908433744]\n",
       "6      occ_husb_2    [0.190635944587]\n",
       "7      occ_husb_3    [0.297832712926]\n",
       "8      occ_husb_4    [0.161408854076]\n",
       "9      occ_husb_5     [0.18777091389]\n",
       "10     occ_husb_6    [0.194016372255]\n",
       "11  rate_marriage   [-0.703123359732]\n",
       "12            age  [-0.0584177744817]\n",
       "13    yrs_married    [0.105676537997]\n",
       "14       children   [0.0169192669709]\n",
       "15      religious   [-0.371136265314]\n",
       "16           educ  [0.00401650319564]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# examine the coefficients\n",
    "pd.DataFrame(list(zip(X.columns, np.transpose(model.coef_))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Increases in marriage rating and religiousness correspond to a decrease in the likelihood of having an affair. For both the wife's occupation and the husband's occupation, the lowest likelihood of having an affair corresponds to the baseline occupation (student), since all of the dummy coefficients are positive."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model Evaluation Using a Validation Set\n",
    "\n",
    "So far, we have trained and tested on the same set. Let's instead split the data into a training set and a testing set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
       "          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,\n",
       "          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,\n",
       "          verbose=0, warm_start=False)"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# evaluate the model by splitting into train and test sets\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)\n",
    "model2 = LogisticRegression()\n",
    "model2.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now need to predict class labels for the test set. We will also generate the class probabilities, just to take a look."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 1.  0.  0. ...,  0.  0.  0.]\n"
     ]
    }
   ],
   "source": [
    "# predict class labels for the test set\n",
    "predicted = model2.predict(X_test)\n",
    "print(predicted)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0.3514634   0.6485366 ]\n",
      " [ 0.90955084  0.09044916]\n",
      " [ 0.72567333  0.27432667]\n",
      " ..., \n",
      " [ 0.55727385  0.44272615]\n",
      " [ 0.81207043  0.18792957]\n",
      " [ 0.74734601  0.25265399]]\n"
     ]
    }
   ],
   "source": [
    "# generate class probabilities\n",
    "probs = model2.predict_proba(X_test)\n",
    "print( probs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, the classifier is predicting a 1 (having an affair) any time the probability in the second column is greater than 0.5.\n",
    "\n",
    "Now let's generate some evaluation metrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.729842931937\n",
      "0.745950606951\n"
     ]
    }
   ],
   "source": [
    "# generate evaluation metrics\n",
    "print( metrics.accuracy_score(y_test, predicted))\n",
    "print( metrics.roc_auc_score(y_test, probs[:, 1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracy is 73%, which is the same as we experienced when training and predicting on the same data.\n",
    "\n",
    "We can also see the confusion matrix and a classification report with other metrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1169  134]\n",
      " [ 382  225]]\n",
      "             precision    recall  f1-score   support\n",
      "\n",
      "        0.0       0.75      0.90      0.82      1303\n",
      "        1.0       0.63      0.37      0.47       607\n",
      "\n",
      "avg / total       0.71      0.73      0.71      1910\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print (metrics.confusion_matrix(y_test, predicted))\n",
    "print (metrics.classification_report(y_test, predicted))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model Evaluation Using Cross-Validation\n",
    "\n",
    "Now let's try 10-fold cross-validation, to see if the accuracy holds up more rigorously."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 0.72100313  0.70219436  0.73824451  0.70597484  0.70597484  0.72955975\n",
      "  0.7327044   0.70440252  0.75157233  0.75      ]\n",
      "0.724163068551\n"
     ]
    }
   ],
   "source": [
    "# evaluate the model using 10-fold cross-validation\n",
    "scores = cross_val_score(LogisticRegression(), X, y, scoring='accuracy', cv=10)\n",
    "print(scores)\n",
    "print(scores.mean())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looks good. It's still performing at 73% accuracy."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Predicting the Probability of an Affair\n",
    "\n",
    "Just for fun, let's predict the probability of an affair for a random woman not present in the dataset. She's a 25-year-old teacher who graduated college, has been married for 3 years, has 1 child, rates herself as strongly religious, rates her marriage as fair, and her husband is a farmer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.77472221,  0.22527779]])"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.predict_proba(np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 3, 25, 3, 1, 4,\n",
    "                              16]).reshape(1,-1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The predicted probability of an affair is 23%."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Next Steps\n",
    "\n",
    "There are many different steps that could be tried in order to improve the model:\n",
    "\n",
    "* including interaction terms\n",
    "* removing features\n",
    "* regularization techniques\n",
    "* using a non-linear model"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
